System, method, apparatus and diagnostic test for Plasmodium vivax

ABSTRACT

A system, method, apparatus and diagnostic test for Plasmodium vivax, to determine a likelihood of a specific timing of infection by P. vivax in a subject, and hence identify individuals with a high probability of being infected with otherwise undetectable liver-stage hypnozoites. The system, method, apparatus and diagnostic test relate to the identification of hypnozoites (“dormant” liver-stages), or at least of the likelihood of the subject being so infected. Optionally and preferably, the specific timing relates to recent infections, for example within the last 9 months.

FIELD OF THE INVENTION

The present invention is of a system, method, apparatus and diagnostic test for relapsing Plasmodium species (i.e. Plasmodium vivax and Plasmodium ovale), and in particular, to such a system, method, apparatus and diagnostic test for Plasmodium vivax for characterizing at least one aspect of infection in a subject or a population of subjects.

BACKGROUND OF THE INVENTION

Plasmodium vivax (P. vivax) is one of five species of parasites that cause malaria in humans. This disease is marked by severe fever and pain, and can be fatal. The symptoms are caused by the parasite's infection, and destruction, of red blood cells in the subject. Infection of new subjects occurs when infectious mosquitoes take a blood meal from humans and inoculate parasites with their saliva.

Like one other species that infects humans, P. ovale, P. vivax has the ability to “hide” in the liver of a subject and remain dormant, and asymptomatic, before (re-)emerging to cause renewed blood-stage infections and malarial symptoms. Transmission from humans to mosquitoes can only occur when the sexual stages of the parasite (gametocytes) are circulating in the blood. Liver-stage infection with hypnozoites is completely undetectable and asymptomatic, and transmission to mosquitoes is not possible. P. falciparum and P. knowlesi do not have this ability to remain dormant in the liver. P. malariae can cause recurrent infections but it remains unclear if these infections persist in the bloodstream, the liver or another organ. This ability to hide from the immune system in the liver for prolonged periods makes P. vivax and P. ovale particularly difficult to detect and treat.

FIG. 1 shows the overall life cycle of the P. vivax parasite (see Mueller, I. et al. Key gaps in the knowledge of Plasmodium vivax, a neglected human malaria parasite. Lancet Infectious Diseases 9, 555-566 (2009)). During a blood meal, a malaria-infected female Anopheles mosquito inoculates sporozoites into the human host (1). Sporozoites infect liver cells (2) and either enter a dormant hypnozoite state or mature into schizonts (3), which rupture and release merozoites (4). After this initial replication in the liver (exo-erythrocytic schizogony A), the parasites undergo asexual multiplication in the erythrocytes (erythrocytic schizogony B). Merozoites infect red blood cells (5). The ring stage trophozoites mature into schizonts, which rupture releasing further merozoites into the blood stream (6). Some parasites differentiate into sexual erythrocytic stages (gametocytes) (7). Blood-stage parasites are responsible for the clinical manifestations of the disease.

The gametocytes, male (microgametocytes) and female (macrogametocytes), are ingested by an Anopheles mosquito during a blood meal (8). The parasites' multiplication in the mosquito is known as the sporogonic cycle (C). While in the mosquito's stomach, the microgametes penetrate the macrogametes generating zygotes (9). The zygotes in turn become motile and elongated (ookinetes) (10) which invade the midgut wall of the mosquito where they develop into oocysts (11). The oocysts grow, rupture, and release sporozoites (12), which make their way to the mosquito's salivary glands. Inoculation of the sporozoites (1) into a new human host perpetuates the malaria life cycle.

Diagnosis of subjects with P. vivax infections is of paramount importance to reducing or even eliminating transmission in a population. Such diagnosis would also significantly help individual subjects to receive proper treatment, including those that have only silent liver-stage infections. Given the high degree of population mobility today, particularly in response to natural disasters or human conflicts, accurate and rapid diagnosis of all P. vivax infections has become even more important to controlling the disease. In addition, as transmission in countries decreases (as each population approaches elimination of the disease), population-level surveillance becomes increasingly important. This surveillance will aid in determining residual areas of transmission within a country, and can also be used to provide evidence for the absence of transmission indicating that elimination has been achieved.

Some proteins have been very well studied and characterized for diagnostic purposes. For example, merozoite surface protein 1 (MSP1), in particular certain C-terminal MSP1-19 fragments and the N-terminal Pv200L fragments have been described as suitable diagnostic antigens. Some examples of prior publications related to this protein include U.S. Pat. No. 6,958,235, which focuses on a fragment of this protein for diagnostic purposes; WO9208795A1, which focuses on this protein for diagnosis; and US20100119539. Merozoite surface protein 3 (MSP3) is described with regard to a diagnostic tool in U.S. Pat. No. 7,488,489. MSP3.10 [merozoite surface protein 3 alpha (MSP3a)] is described as part of the family of merozoite surface protein 3 like proteins for diagnostic and other purposes in US20070098738. Rhoptry associated membrane antigen is described with regard to a diagnostic tool in EP0372019 B1. Many other proteins were described in relation to their immunogenicity and hence their therapeutic utility as part of a vaccine. Some non-limiting examples are given below.

UniProt | Annotation¹ | Patent information
A5K3N8 | rhoptry neck protein 2, putative (RON2) | Vaccine including this protein (US20160158332); specifically described and claimed for diagnosis in EP2520585, no family members, abandoned in 2013
A5KBS6 | hypothetical protein, conserved (PvLSA3^(d)) | WO2015091734 (vaccine)
A5K4Z2 | apical merozoite antigen 1 (PvAMA1) | U.S. Pat. No. 9,364,525 (one of a list of antigens for a vaccine, downloaded as US20100150998); WO2006037807 - structure of this antigen; U.S. Pat. No. 7,150,875 - vaccine specifically directed at this antigen
A5K0N7 | translocon component PTEX150, putative (PTEX150) | US20140348870 - Especially preferred antigens are post-challenge immunity associated antigens that are identified via pre-infection suppressive treatment, controlled sub-symptomatic infection to develop immunity, and comparative proteomic differential analysis. WO2010127398 - more focused on treatment
A5KBL6 | merozoite surface protein 5 | WO2014186798 - immune stimulation (1 of a long list of diseases and antigens); U.S. Pat. No. 8,350,019 (focuses on this protein for diagnostic use); WO2015031904 - use of this protein to determine if an individual is protected against malaria; WO2016030292 - focused on treatment; US20110020387 - malaria vaccine
A5K800 | MSP7 [merozoite surface protein 7 (MSP7)] | EP2990059 - therapeutic but mentions MSP7 specifically
A5K736 | reticulocyte binding protein 2b (RBP2b) | U.S. Pat. No. 8,703,147 - treatment and prevention of malaria
A5KAV2 | merozoite surface protein 3 (MSP3.3) | EP2223937 - prevention and treatment of malaria; describes the gene family that includes this protein for diagnosis and treatment - EP1689866
A5KAU1 | merozoite surface protein 8, putative | US20140348870 - identified this protein as immunogenic
A5K806 | thrombospondin-related anonymous protein (PvTRAP/SSP2) | Immunogenic, part of a vaccine: US20100272745, U.S. Pat. No. 7,790,186, U.S. Pat. No. 7,150,875, WO2013142278, WO2015091734
A5KDR7 | Duffy receptor precursor (DBP) | mentioned as immunogenic protein, part of a vaccine: U.S. Pat. No. 7,790,186
A5KAW0 | MSP3.10 [merozoite surface protein 3 alpha (MSP3a)] | US20070098738 - describes entire protein family; US707129 - describes various members of this family as being immunogenic

Still other proteins have barely been described or characterized in the literature. In some cases, these proteins have not yet been described with regard to their stage in the P. vivax life cycle. In other cases, an initial determination of the stage has been made but their diagnostic or therapeutic utility is not known. A non-limiting list of some of these proteins is provided below. A further list is provided with regard to Appendix I, although optionally any annotated proteins from P. vivax in Uniprot (http://www.uniprot.org/uniprot/) or another suitable protein database could be included.

Uniprot | Protein name
A5K7E7 | hypothetical protein, conserved
A5K482 | hypothetical protein, conserved
A5K0Q6 | hypothetical protein, conserved
A5K4N0 | hypothetical protein, conserved
A5KAP7 | hypothetical protein, conserved
A5K4I6 | hypothetical protein, conserved
A5K659 | hypothetical protein, conserved
A5KB45 | hypothetical protein, conserved

Very few attempts have been made to characterize the life cycle of the parasite within the body for diagnostic purposes, in terms of the dynamics of the proteins or antibody responses to specific proteins present in the blood. For example, an assay for determining a state of protective immunity is described in US20160216276. However, the disclosure relates to diagnostic assays for identifying individuals that are protected against Plasmodium falciparum caused malaria. As noted above, P. falciparum does not have a dormant liver stage with long-latency giving rise to relapses. This patent application does not mention P. vivax.

Other prior art disclosures for diagnostics focus only on the blood stage of P. vivax, which again prevents a complete picture of the dynamics of the infection in the subject from being determined. U.S. Pat. No. 6,231,861 and US20090117602 both suffer from this deficiency.

In other cases, where a plurality of antigens were examined for malarial diagnostics of P. vivax, the results still did not provide a complete picture of the dynamics of the infection in the subject. For example, “Genome-Scale Protein Microarray Comparison of Human Antibody Responses in Plasmodium vivax Relapse and Reinfection” (Chuquiyauri et al; Am. J. Trop. Med. Hyg., 93(4), 2015, pp. 801-809) suffered from the following drawbacks:

i) It only features antibody signatures that differentiate between blood-stage infections that are thought to stem either from direct infections or relapsing infections;

ii) The phenotypes are of poor quality because they are focused on genotyping with only one gene, so may overestimate the number of new infections vs relapses;

iii) They are only looking at the presence and titer of antigens at one timepoint (i.e. at the time of infection).

In another example, “Serological markers to measure recent changes in malaria at population level in Cambodia” (Kerkhof et al; Malaria Journal, 15 (1), 2016, pp. 529), the authors calculated estimated antibody half-lives to 19 Plasmodium proteins, including 5 P. vivax proteins. These 5 proteins are well-known vaccine candidates (CSP, AMA1, EBP, DBP and MSP1), and the work included no formal antigen down-selection. A major limitation of this study is that individuals were only assessed for malaria prevalence every 6 months, and hence the estimated half-lives are not a true biological reflection of what occurs in the absence of re-infection. The authors only identified one P. vivax antigen, EBP, that had an estimated antibody half-life of less than 2 years.

BRIEF SUMMARY OF THE INVENTION

The present invention, in at least some embodiments, is of a system, method, apparatus and diagnostic test for Plasmodium vivax, to determine a likelihood of a specific timing of infection by P. vivax in a subject, and hence identify individuals with a high probability of being infected with otherwise undetectable liver-stage hypnozoites. According to at least some embodiments, the system, method, apparatus and diagnostic test relate to the identification of hypnozoites (“dormant” liver-stages), or at least of the likelihood of the subject being so infected. Optionally and preferably, the specific timing relates to recent infections, for example within the last 9 months. Without wishing to be limited by a closed list, the present invention is able to identify such recent infections, and not just current infections.

Non-limiting examples of elapsed time periods since an infection include time since infection ranging from 0 to 12 months, and each time period in between, by month, by week, and/or by day. Optionally and preferably a particular time period is determined as a binary decision of a more recent or an older infection, with each time point as a cut-off. As a non-limiting example, one such cut-off could determine whether an infection in a subject occurred within the past 9 months or more than 9 months ago.

Optionally the timing of such an infection may also be determined, such that one or more of the following parameters may be determined. One such parameter is optionally whether the infection is a first infection in the patient, of P. vivax generally or of a particular strain of P. vivax. As there is no sterilizing immunity in malaria, immunity to one strain does not necessarily confer immunity to another, different strain. However, as described in greater detail below with regard to the examples, the present invention was tested by using samples from three different regions (including Brazil, Thailand and the Solomon Islands). These three populations are genetically highly diverse and represent the major part of the global genetic variation in P. vivax. Consequently, the present inventors believe, without wishing to be limited by a single hypothesis, that the present invention will work anywhere in the world. Other parameters relate to time elapsed from the previous infection.

According to at least some embodiments, the antibody measurements may optionally be used to provide an estimation of elapsed time since last infection. An estimate of the time since last P. vivax blood-stage infection, depending on the available calibration data, can be defined either as the time since last PCR-detectable blood-stage parasitemia, or as the time since last infective mosquito bite. Time since last infection can be estimated continuously or categorically. Concurrent estimation of uncertainty will be important.

According to at least some embodiments, the antibody measurements may optionally be used to provide a determination of medium-term serological exposure, for example a frequency of infections during a particular time period and/or time since last infection.

According to at least some embodiments, there is provided a system, method, apparatus and diagnostic test for detection of a “silent” (asymptomatic or presymptomatic) infection by P. vivax.

According to at least some embodiments, there is provided a system, method, apparatus and diagnostic test for detection of a dormant infection, in which P. vivax is present in the liver but is not present at detectable levels in the blood. As described herein, detection of a dormant infection optionally comprises prediction from an indirect measurement of an antibody level.

During the life cycle of P. vivax, blood-stage forms of the parasite can initially be present at the same time as arrested liver forms, as described in the Background of the Invention. Even after the blood-stage infection has cleared, hypnozoites can still be present in the liver, and the parasite may still be indirectly detected via persisting antibody responses against the primary blood-stage infection. According to at least some embodiments, there is provided a system, method, apparatus and diagnostic test for detection of antibodies to malarial proteins that are present in the blood that indicate a high degree of probability of liver-stage infection.

According to at least some embodiments, there is provided a system, method, apparatus and diagnostic test for determination of the progression of infection by P. vivax in a population of a plurality of subjects. Optionally, it is possible to determine the rate of propagation of a particular Plasmodium species in a population not previously exposed to that species.

With regard to the diagnostic test, in at least some embodiments, there is provided a plurality of antibodies that bind to a plurality of antigens in a blood sample taken from the subject. Optionally any suitable tissue biological sample from a subject may be used for detecting a presence and/or level of a plurality of antibodies.

According to at least some embodiments, the dynamics of the measured antibodies preferably include a combination of short-lived and long-lived antibodies. Without wishing to be limited by a single hypothesis or a closed list, such a combination is useful to reduce measurement error.

Optionally the level of antibodies is measured at one time point or a plurality of time points.

Optionally, the presence of the actual antibodies in the blood of the subject is measured at a plurality of time points to determine decay in the level of the antibody in the blood. Such a decay in the level is then optionally and preferably fitted to a suitable model as described herein, in order to determine at least one of the infection parameters as described above. More preferably, decay of the level of a plurality of different antibodies is measured. Optionally and more preferably, the different antibodies are selected to have a range of different half-lives. Optionally, a maximum number of different antibodies is measured, which is optionally up to 20 or as few as two, or any integral number in between. According to at least some embodiments, the number of antibodies is preferably 4 or 8.
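By way of a non-limiting illustration, the following Python sketch shows one way in which a half-life could be derived from antibody levels measured at a plurality of time points, by fitting a straight line to the logarithm of the level against time. The function name `estimate_half_life`, the use of ordinary least squares and the example values are assumptions made for illustration only and are not taken from the disclosure.

```python
import math


def estimate_half_life(times, titers):
    """Estimate a half-life from titers measured at several time points.

    Fits log(titer) = intercept + slope * time by ordinary least squares
    and converts the fitted decay rate into a half-life (same unit as times).
    Illustrative sketch only; a single-exponential decay is assumed.
    """
    if len(times) != len(titers) or len(times) < 2:
        raise ValueError("need at least two paired measurements")
    logs = [math.log(a) for a in titers]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    if slope >= 0:
        return math.inf  # no decay observed in these measurements
    return math.log(2) / -slope


# Hypothetical titers that halve every 3 months
half_life = estimate_half_life([0, 3, 6, 9], [100.0, 50.0, 25.0, 12.5])
```

Antibodies having different fitted half-lives could then be combined in a panel, in keeping with the preference stated above for a range of different half-lives.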

According to at least some embodiments, the level is measured of at least one antibody to a protein selected from the group consisting of: PVX_099980, PVX_112670, PVX_087885, PVX_082650, PVX_088860, PVX_112680, PVX_112675, PVX_092990, PVX_091710, PVX_117385, PVX_098915, PVX_088820, PVX_117880, PVX_121897, PVX_125728, PVX_001000, PVX_084340, PVX_090330, PVX_125738, PVX_096995, PVX_097715, PVX_094830, PVX_101530, PVX_090970, PVX_084720, PVX_003770, PVX_112690, PVX_003555, PVX_094255, PVX_090265, PVX_099930, PVX_123685, PVX_002550, PVX_082700, PVX_097680, PVX_097625, PVX_082670, PVX_082735, PVX_082645, PVX_097720, PVX_000930, PVX_094350, PVX_099930, PVX_114330, PVX_088820, PVX_080665, PVX_092995, PVX_087885, PVX_003795, PVX_087110, PVX_087670, PVX_081330, PVX_122805, RBP1b (P7), RBP2a (P9), RBP2b (P25), RBP2cNB (M5), RBP2-P2 (P55), PvDBP R3-5, PvGAMA, PvRipr, PvCYRPA, Pv DBPII (AU), PvEBP, RBP1a (P5) and Pv DBP (SacI).

Preferably, the level is measured of at least one antibody to a protein selected from the group consisting of PVX_099980, PVX_112670, PVX_087885, PVX_082650, PVX_088860, PVX_112680, PVX_112675, PVX_092990, PVX_091710, PVX_117385, PVX_098915, PVX_088820, PVX_117880, PVX_121897, PVX_125728, PVX_001000, PVX_084340, PVX_090330, PVX_125738, PVX_096995, PVX_097715, PVX_094830, PVX_101530, PVX_090970, PVX_984720, PVX_003770, PVX_112690, PVX_003555, PVX_094255, PVX_090265, PVX_099930 and PVX_123685.

More preferably, the level is measured of at least one antibody to a protein selected from the group consisting of PVX_099980, PVX_112670, PVX_087885, PVX_082650, PVX_096995, PVX_097715, PVX_094830, PVX_101530, PVX_090970, PVX_084720, PVX_003770, PVX_112690, PVX_003555, PVX_094255, PVX_090265, PVX_099930 and PVX_123685.

Most preferably, the level is measured of at least one antibody to a protein selected from the group consisting of PVX_099980, PVX_112670, PVX_087885 and PVX_082650.

According to at least some embodiments, preferably the level is measured of at least one antibody to a protein selected from the group consisting of RBP2b, L01, L31, X087885, PvEBP, L55, PvRipr, L54, L07, L30, PvDBPII, L34, X092995, L12, RBP1b, L23, L02, L32, L28, L19, L36, L41, X088820 and PvDBP . . . SacI.

More preferably the level is measured of at least one antibody to a protein selected from the group consisting of RBP2b, L01, L31, X087885, PvEBP, L55, PvRipr, L54, L07, L30, PvDBPII, L34, X092995, L12 and RBP1b.

Also more preferably the level is measured of at least one antibody to a protein selected from the group consisting of RBP2b, L01, L31, X087885, PvEBP, L55, PvRipr and L54.

Most preferably the level is measured of at least one antibody to a protein selected from the group consisting of RBP2b and L01.

A table containing additional proteins against which antibodies may optionally be measured is provided herein in Appendix I, as described in greater detail below, such that the level of one or more of these antibodies may optionally be measured.

Appendix II gives a list of preferred proteins against which antibodies may be measured, while Appendix III shows a complete set of data for antibodies against the proteins in Appendix II. Appendix III is given in two parts, due to the size of the table: Appendix IIIA and Appendix IIIB. The references to gene identifiers in Appendix II are the common ones used for Plasmodium, from the PlasmoDB website: http://plasmodb.org.plasmo/.

For any protein described herein, optionally a fragment and/or variant may be used for detecting the presence and/or level of one or more antibodies in a biological sample taken from a subject.

According to at least some embodiments, a biologically-motivated model of the decay of antibody titers over time is used to determine a statistical inference of the time since last infection. The model preferably uses previously determined decay rates of a plurality of different antibodies to determine a likelihood that infection in the subject occurred within a particular time period. Optionally such previously determined decay rates may be achieved through estimation of antibody decay rates from longitudinal data, or estimation of decay rates from cross-sectional antibody measurements.

With regard to estimation of antibody decay rates from longitudinal data, preferably such an estimation is performed as described in equation (1), which is a mixed-effects linear regression model:

\[
\begin{aligned}
\log(A_{ijk}) &\sim (\alpha_k^0 + \alpha_{ik}) + (r_k^0 + r_{ik})\, t_j + \varepsilon_k \\
\alpha_{ik} &\sim N(0, \sigma_{\alpha,k}) \\
r_{ik} &\sim N(0, \sigma_{r,k}) \\
\varepsilon_k &\sim N(0, \sigma_{m,k})
\end{aligned}
\qquad \text{Equation 1}
\]

The above equation rests on the following assumptions. We assume that for individual i we have measurements of antibody titer $A_{ijk}$ at time j to antigen k. We assume that at time 0, antibody titers are Normally distributed with mean $\alpha_k^0$ and standard deviation $\sigma_{\alpha,k}$ on a log-scale. We assume that an individual's rate of antibody decay is drawn from a Normal distribution with mean $r_k^0$ and standard deviation $\sigma_{r,k}$. The remaining term, $\varepsilon_k$, is a Normally distributed error with standard deviation $\sigma_{m,k}$.
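As a non-limiting illustration of Equation 1, the following Python sketch simulates log antibody titers for a single individual and a single antigen. The function name `simulate_log_titers`, the parameter values and the use of independent draws for each random term are assumptions made for illustration; they are not parameter estimates from the studies described herein.

```python
import math
import random


def simulate_log_titers(alpha0, r0, sigma_alpha, sigma_r, sigma_m, times, rng):
    """Simulate log titers for one individual and one antigen under Equation 1.

    alpha0, r0   : population mean initial log titer and mean decay rate
    sigma_alpha  : standard deviation of the individual initial log titer
    sigma_r      : standard deviation of the individual decay rate
    sigma_m      : standard deviation of the measurement error
    """
    alpha_i = rng.gauss(0.0, sigma_alpha)  # individual offset in initial log titer
    r_i = rng.gauss(0.0, sigma_r)          # individual offset in decay rate
    return [
        (alpha0 + alpha_i) + (r0 + r_i) * t + rng.gauss(0.0, sigma_m)
        for t in times
    ]


# Hypothetical parameter values, times in months
rng = random.Random(0)
noisy = simulate_log_titers(5.0, -0.2, 0.5, 0.05, 0.1, [0, 3, 6, 9], rng)
# With all variation switched off, the trajectory is the population mean line
mean_line = simulate_log_titers(5.0, -0.2, 0.0, 0.0, 0.0, [0, 5, 10], rng)
```

Setting the three standard deviations to zero recovers the population mean trajectory, which makes the separate roles of the three sources of variation easy to inspect.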

According to at least some embodiments, the plurality of different antibodies selected maximizes probability of determining at least one of the infection parameters as described above. A method for such a selection process is described below in Example 3. Optionally the plurality of antibodies is selected for determining an answer to a binary determinant, such as for example, whether an individual was infected before x months ago or after as previously described.

According to at least some embodiments, the model for determining at least one parameter about the infection in the subject may optionally comprise one or more of the following algorithms: linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), combined antibody dynamics (CAD), decision trees, random forests, boosted trees and modified decision trees.

According to at least some embodiments, the levels of antibody in a blood sample can be measured and summarized in a variety of ways, for input to the above described model.

a) Continuous Measurement

A continuous measurement that has a monotonic relationship with antibody titer. It can be compared with a titration curve to produce an estimate of antibody titer.

b) Binary Classification

Assesses whether antibody levels are greater or less than some threshold.

c) Categorical Classification

Assigns antibody levels to one of a set of pre-defined categories, e.g. low, medium, high. A categorical classification can be generated via a series of binary classifications.
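The binary and categorical summaries described above can be sketched, in a non-limiting manner, as follows. The function names, the thresholds and the category labels are hypothetical and chosen only for illustration.

```python
def binary_classification(level, threshold):
    """Return True if the antibody level is greater than the threshold."""
    return level > threshold


def categorical_classification(level, thresholds, labels):
    """Assign a level to a category via a series of binary classifications.

    thresholds must be in ascending order; labels must contain exactly one
    more entry than thresholds (e.g. two thresholds give low/medium/high).
    """
    if len(labels) != len(thresholds) + 1:
        raise ValueError("labels must have one more entry than thresholds")
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in ascending order")
    category = 0
    for threshold in thresholds:
        # Each threshold exceeded moves the level up by one category
        if binary_classification(level, threshold):
            category += 1
    return labels[category]


# Hypothetical thresholds on an arbitrary titer scale
labels = ["low", "medium", "high"]
result = categorical_classification(5.0, [1.0, 10.0], labels)
```

In this sketch a level exactly equal to a threshold is assigned to the lower category; the opposite convention would serve equally well, provided that it is applied consistently.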

According to at least some embodiments, antibody levels may optionally be measured in a subject in a number of different ways, including but not limited to, bead-based assays (e.g. AlphaScreen® or Luminex® technology), the enzyme linked immunosorbent assay (ELISA), protein microarrays and the luminescence immunoprecipitation system (LIPS). All the aforementioned methods generate a continuous measurement of antibody.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a background art description of the lifecycle of P. vivax (see Mueller, I. et al. Key gaps in the knowledge of Plasmodium vivax, a neglected human malaria parasite. Lancet Infectious Diseases 9, 555-566 (2009)).

FIG. 2 shows a method for data processing and down-selection of candidate serological markers.

FIG. 3 shows an example of two differing antibody kinetic profiles. Antibody responses at the four time-points measured in the AlphaScreen® assay are shown for two proteins, PVX_099980 and PVX_122680. An arbitrary positivity cut-off is marked at 0.94 (the average of the wheat germ extract control well + 6 × standard deviation). Data is generated from 32 individuals in Thailand.
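For illustration only, a positivity cut-off of the kind marked in FIG. 3 (average of the control wells plus 6 standard deviations) could be computed as in the following Python sketch. The control values are hypothetical, and the use of the sample standard deviation (rather than the population standard deviation) is an assumption, since the text does not specify which was used.

```python
import statistics


def positivity_cutoff(control_values, n_sd=6.0):
    """Return mean(control) + n_sd * standard deviation(control).

    Uses the sample standard deviation; this choice is an assumption.
    """
    mean = statistics.mean(control_values)
    sd = statistics.stdev(control_values)
    return mean + n_sd * sd


# Hypothetical control-well readings on an arbitrary scale
cutoff = positivity_cutoff([1.0, 2.0, 3.0])
```

A measured antibody response above the returned value would then be treated as positive.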

FIG. 4 shows characteristics of the top 55 protein constructs. (A) Length of the estimated antibody half-lives; note that for 4 proteins the classification was different between Thailand and Brazil. (B)-(F) Details of protein characteristics as determined by PlasmoDB release 25 or published literature: (B) predicted expression stage, (C) presence of a signal peptide sequence, (D) presence of transmembrane domain/s, (E) presence of a GPI anchor, (F) annotation. TM=transmembrane domains, MSPs=merozoite surface proteins, RBPs=reticulocyte binding proteins.

FIG. 5 shows correlation between antibody measurements in Thailand and Brazil. Correlation of data from the antigen discovery study generated using the AlphaScreen® assay. Correlations are shown for the 55 down-selected candidate serological markers. (A) Comparison of the proportion of individuals defined as positive at time of P. vivax infection (antibody value above the lower point of the standard curve, i.e. 0). (B) Comparison of the geometric mean antibody titers (GMT). (C) Comparison of the estimated antibody half-lives. Spearman correlation coefficients, r, are shown. Data was generated from 32 individuals in Thailand and 33 in Brazil.

FIG. 6A shows optimization of Luminex® bead-array assay for the first 17 proteins. Log-linear standard curves were achieved for all proteins, using the amounts of protein shown for one bulk reaction of 500 μl beads.

FIGS. 6B-6D show additional development and optimization of the Luminex® bead-array assay for all 65 proteins assessed in the validation study as follows. FIG. 6B shows 40 down-selected proteins. FIG. 6C shows the remaining 25 proteins. Log-linear standard curves were achieved for all proteins. The amount of protein for one bulk reaction of 500 μl beads is shown in FIG. 6D, with the line indicating the median (1 and 1.08 μg, respectively).

FIG. 7 shows the association of antibody levels with current P. vivax infections in the Thai validation cohort. Antibody responses were measured at the last time-point of the Thai cohort against the first 17 proteins assessed, using the Luminex® bead-array assay. The association between antibody responses and current infection was assessed using a logistic regression model, adjusting for age, sex and occupation. Odds ratios are shown, with 95% confidence intervals. Associations for all antibodies were significant (p<0.05). The estimate of antibody half-life shown is based on the antigen discovery dataset (AlphaScreen®).

FIG. 8 shows association of antibody levels with past P. vivax exposure in the Thai validation cohort. Antibody responses were measured at the last time-point of the Thai cohort against the first 17 proteins assessed, using the Luminex® bead-array assay. The association between antibody responses and total exposure over the past year was assessed using a generalised linear model, adjusting for age, sex, occupation and current infection status. Exponentiated coefficients are shown, with 95% confidence intervals. Associations for all antibodies, except PVX_09070, were significant (p<0.05). The estimate of antibody half-life shown is based on the antigen discovery dataset (AlphaScreen®).

FIG. 9 shows the association of antibody levels with current P. vivax infections in the Brazilian validation cohort. Antibody responses were measured at the last time-point of the Brazilian cohort against the first 17 proteins assessed, using the Luminex® bead-array assay. The association between antibody responses and current infection was assessed using a logistic regression model, adjusting for age, sex and occupation. Odds ratios are shown, with 95% confidence intervals. Associations for all antibodies, except PVX_088860, were significant (p<0.05). The estimate of antibody half-life shown is based on the antigen discovery dataset (AlphaScreen®).

FIG. 10 shows the association of antibody levels with past P. vivax exposure in the Brazilian validation cohort. Antibody responses were measured at the last time-point of the Brazilian cohort against the first 17 proteins assessed, using the Luminex® bead-array assay. The association between antibody responses and total exposure over the past year was assessed using a generalised linear model, adjusting for age, sex, occupation and current infection status. Exponentiated coefficients are shown, with 95% confidence intervals. Associations for 10 of the 17 antibodies were significant (p<0.05). The estimate of antibody half-life shown is based on the antigen discovery dataset (AlphaScreen®).

FIG. 11 shows longitudinal antibody dynamics of 4 antigens from 8 Thai participants in the antigen discovery cohort. For each blood sample antibody titers were measured in triplicate, using the AlphaScreen® assay. Each colour corresponds to antibodies to a different antigen. The lines represent the fit of the mixed-effects regression model described below.

FIG. 12 shows the relationship between antibody titers to 8 P. vivax antigens and time since last PCR-detectable infection in individuals from a malaria-endemic region of Thailand (validation study, antibodies measured via Luminex® bead-array assay). The grey bars denote individuals with current infection (n=25); infection within the last 9 months (n=47); infection 9-14 months ago (n=25); and no infection detected within the last 14 months (n=732). The orange bars show the antibody titers from three different panels of negative controls.

FIG. 13 presents the association between measured antibody titer x_(ik) and time since infection t. (a) There are three sources of variation in the antibody titer x_(ik) measured at time t since last infection: (i) variation in initial antibody titer; (ii) between individual variation in antibody decay rate; and (iii) measurement error. (b) Given estimates of the sources of variation, we can estimate the distribution of the time since last infection. The maximum likelihood estimate and the 95% confidence intervals of our estimate are indicated in blue.
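As a non-limiting sketch of the estimation illustrated in FIG. 13, the following Python code computes a maximum likelihood estimate of the time since last infection from a single log antibody titer. It assumes, for illustration only, that the three sources of variation are independent and Normally distributed, so that the log titer at time t has mean alpha0 + r0·t and variance sigma_alpha² + (sigma_r·t)² + sigma_m². The function names, the grid search and all parameter values are hypothetical.

```python
import math


def log_likelihood(x, t, alpha0, r0, sigma_alpha, sigma_r, sigma_m):
    """Log-likelihood of observing log titer x at time t since infection.

    Assumes independent Normal variation in initial titer, decay rate and
    measurement error (an illustrative assumption).
    """
    mean = alpha0 + r0 * t
    var = sigma_alpha ** 2 + (sigma_r * t) ** 2 + sigma_m ** 2
    if var <= 0:
        raise ValueError("at least one source of variation must be non-zero")
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def estimate_time_since_infection(x, params, max_months=12.0, step=0.1):
    """Return the grid point in [0, max_months] that maximizes the likelihood."""
    n_steps = int(round(max_months / step))
    grid = [i * step for i in range(n_steps + 1)]
    return max(grid, key=lambda t: log_likelihood(x, t, **params))


# Hypothetical parameters for a single antigen, times in months
params = {"alpha0": 5.0, "r0": -0.5, "sigma_alpha": 0.5,
          "sigma_r": 0.0, "sigma_m": 0.2}
t_hat = estimate_time_since_infection(3.0, params)
```

Evaluating the same log-likelihood over the whole grid, rather than only at its maximum, would yield a curve from which an interval of plausible infection times could be read.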

FIG. 14 shows the dynamics of multiple antibodies. The variance in initial antibody titer, antibody decay rates and measurement error are now described by covariance matrices, which account for the correlations between antibodies.

FIG. 15 shows an example of QDA classification for participants from the Thai validation cohort. Antibody measurements were made using the Luminex® bead-array assay. Each point corresponds to a measurement from a single individual with log(anti-L01 antibody titer) on the x-axis and log(anti-L22 antibody titer) on the y-axis. The blue ellipse represents the multivariate Gaussian fitted to data from individuals with ‘old’ infections and the red ellipse represents the multivariate Gaussian fitted to data from individuals with ‘new’ infections. The dashed black line represents the boundary for classifying individuals according to whether or not they have had a recent infection.

FIG. 16 shows receiver operator characteristic (ROC) curves estimated via cross-validation for LDA (blue) and QDA (black) classification algorithms, using the Thai validation data measured via the Luminex® bead-array assay.

FIG. 17 shows an example of a decision tree for classifying old versus new infections using measurements of antibodies to 6 P. vivax antigens, using the Thai validation data measured via the Luminex® bead-array assay.

FIG. 18 shows ROC curves demonstrating the association between sensitivity and specificity for a decision tree algorithm, using the Thai validation data measured via the Luminex® bead-array assay. These curves have been generated through cross-validation by splitting the data into training and testing sets. The algorithm is formulated using the training data set and the sensitivity and specificity evaluated on the testing data set. The colours correspond to different subsets of antigens. Notably, we can obtain nearly 80% sensitivity with specificity >95%.

FIG. 19 shows a random forest variable importance plot of the contribution of antibodies to 17 antigens towards correct classification of infections into ‘new’ versus ‘old’, using the Thai validation data measured via the Luminex® bead-array assay. Antigens with greater values of ‘MeanDecreaseAccuracy’ are considered the most informative. Therefore L01 provides the most information for classification purposes.

FIG. 20 shows an example of antigen down-selection using the simulated annealing algorithm. Data come from the antigen discovery study using the AlphaScreen® assay. (A) Including additional antigens increases the likelihood that infection times will be correctly classified, but with diminishing returns. (B) Each column of the heatmap denotes one of K=98 antigens. The y-axis denotes the maximum number of antigens that can be included in a panel. Red antigens are more likely to be included in a panel of a given size. (C) Example of predicting time since last infection in 4 individuals using a panel of 15 antigens. The vertical dashed line at 6 months represents an infection occurring 6 months ago. The solid black curve denotes the estimated distribution of the time since last infection. The green point denotes the maximum likelihood estimate of the model, and the vertical green bars denote the 95% confidence intervals. The red, shaded area denotes infection within the last 9 months. If more than 50% of the probability mass of the distribution is in this region, then the infection will be classified as having occurred within the last 9 months.

FIG. 21 shows a comparison of age-stratified prevalence of PCR-detectable blood-stage infection within the last 9 months;

FIG. 22 shows measured antibody titers to four P. vivax antigens from Thailand, Brazil and the Solomon Islands, and from three panels of negative controls. The box plots show the median, interquartile range and 95% range of measured antibody titers. The horizontal dashed lines represent the lower and upper limits of detection;

FIGS. 23A-23C show an overview of cross-validated random forests classification algorithms. The classifiers were trained on data from either Thailand, Brazil or the Solomon Islands; and

FIG. 24 shows an exemplary network visualization of combinations of 4 antigens. The size of the node represents the probability that an antigen appears in the best performing combinations. The width and darkness of the edges represent the probability that two antigens are selected together in the best performing combinations. Red denotes proteins purified at high yield by CellFree Sciences (the 40 down-selected proteins, the results for which are shown in FIG. 6B). Blue denotes vaccine candidate antigens. Green denotes proteins expressed in wheat-germ by Ehime University. Blue and green proteins are the 25 additional proteins, the results for which are shown in FIG. 6C.

FIG. 25 shows cross-validated Receiver Operating Characteristic (ROC) curves from linear discriminant analysis (LDA) classifiers trained and tested using combinations of four antigens from Thailand, Brazil and the Solomon Islands.

Description of at Least Some Embodiments

The present invention, in at least some embodiments, is of a system, method, apparatus and diagnostic test for at least Plasmodium vivax, and optionally other species such as P. ovale, to determine a likelihood of a concurrent infection, or the specific timing of a recent past infection, by P. vivax in a subject, and hence identify individuals with a high probability of being infected with otherwise undetectable liver-stage hypnozoites. According to at least some embodiments, the system, method, apparatus and diagnostic test relate to the identification of hypnozoites (“dormant” liver-stages), or at least of the likelihood of the subject being so infected. Optionally and preferably, the specific timing relates to recent infections, for example within the last 9 months. Without wishing to be limited by a closed list, the present invention is able to identify such recent infections, and not just current infections.

According to at least some embodiments, the antibody measurements may optionally be used to provide an estimation of elapsed time since last infection, that is, the time since the last P. vivax blood-stage infection. Depending on the available calibration data, the time since last infection can be defined either as the time since last PCR-detectable blood-stage parasitemia, or as the time since last infected mosquito bite. Time since last infection can be estimated continuously or categorically. Concurrent estimation of uncertainty will be important.
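By way of non-limiting illustration only, the following Python sketch shows how a single antibody measurement could be turned into a distribution over time since last infection and then into a categorical call. It assumes a single antigen, exponential antibody decay on the log scale, and the three sources of variation named in the description of FIG. 13 (initial titer, decay rate, measurement error), and it applies the 50% probability mass rule described for FIG. 20. The function names and all numerical parameter values are hypothetical and are not calibration values from the studies described herein.

```python
import math

def time_since_infection_distribution(log_titer, a0, r, sd_a0, sd_r, sd_m,
                                      max_months=24.0, step=0.1):
    """Normalised likelihood of each candidate time t (months) on a grid.

    Assumed model: log titer at time t ~ Normal(a0 - r*t,
    sd_a0**2 + (t*sd_r)**2 + sd_m**2), i.e. variation in initial titer,
    in decay rate, and in measurement.
    """
    n = int(round(max_months / step))
    grid = [i * step for i in range(n + 1)]
    likelihood = []
    for t in grid:
        mean = a0 - r * t
        var = sd_a0 ** 2 + (t * sd_r) ** 2 + sd_m ** 2
        density = math.exp(-0.5 * (log_titer - mean) ** 2 / var)
        likelihood.append(density / math.sqrt(2.0 * math.pi * var))
    total = sum(likelihood)
    return grid, [v / total for v in likelihood]

def classify_recent(grid, probs, window_months=9.0, threshold=0.5):
    """Return (probability mass within the window, True if above threshold)."""
    mass = sum(p for t, p in zip(grid, probs) if t <= window_months + 1e-9)
    return mass, mass > threshold

# Hypothetical calibration values, for illustration only.
calibration = dict(a0=5.0, r=0.3, sd_a0=0.5, sd_r=0.05, sd_m=0.2)

grid, probs = time_since_infection_distribution(4.4, **calibration)
mass_recent, is_recent = classify_recent(grid, probs)
```

A multi-antigen version would replace the scalar variances with the covariance matrices described for FIG. 14; that extension is not shown here.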

According to at least some embodiments, the antibody measurements may optionally be used to provide a determination of medium-term serological exposure, for example a frequency of infections during a particular time period and/or time since last infection.

According to at least some embodiments, there is provided a system, method, apparatus and diagnostic test for detection of a “silent” (asymptomatic or presymptomatic) infection by P. vivax.

Protein Nomenclature

Throughout the below experiments, simplified names have been used for the proteins assessed. In the antigen discovery experiments using the AlphaScreen® assay, 342 proteins were assessed. These proteins were given codes consisting of single letters followed by 2 numbers in most instances, and on occasion 3 numbers.

In the validation experiments using the multiplexed assay (Luminex® technology), 40 proteins (out of the 53 potential candidates down-selected) were assessed. These proteins have been given codes beginning with ‘L’ followed by 2 numbers. These proteins were supplemented by an additional 25 proteins expressed in a variety of systems. These proteins have been given codes beginning with ‘V’ or ‘X’ followed by 2 numbers. The codes used for the tested candidates are outlined below, as well as in Appendix II, in reference to their PlasmoDB gene ID (plasmodb.org).

PlasmoDB ID    AlphaScreen    Luminex
PVX_099980     D10            L01
PVX_096995     J12            L02
PVX_088860     L19            L03
PVX_097715     N17            L07
PVX_112680     K21            L06
PVX_094830     N13            L10
PVX_112675     B19            L11
PVX_112670     G21            L12
PVX_101530     D21            L05
PVX_090970     E10            L14
PVX_084720     B8             L18
PVX_003770     P17            L19
PVX_092990     H14            L20
PVX_112690     K10            L21
PVX_091710     F13            L22
PVX_087885     N9             L23
PVX_003555     O21            L24

A complete list of all sequences considered, plus the sequences themselves, may be found in Appendices I and II combined. These sequences include the reference to the amino acid and nucleic acid sequence records of the relevant antigens, plus actual sequences generated for testing. The actual amino acid sequences generated for testing include a methionine at the start (N-terminus) and a His-tag at the end (C-terminus) as non-limiting examples only. The nucleic acid sequences so generated correspond to these amino acid sequences. It should be noted that the sequences listed are intended as non-limiting examples only, as different sequences and/or different antigens may optionally be used with the present invention, additionally or alternatively. The amino acid sequences for the specific proteins referred to herein may optionally be obtained from Uniprot or another suitable protein database.

EXAMPLE 1 Testing of Antigens

This non-limiting Example relates to testing of antibody responses to various P. vivax proteins, present in the blood, as potential antigens for a diagnostic test.

Materials and Methods

Ethics Statement.

The relevant local ethics committees approved all field studies and all patients gave informed consent or assent. The Ethics Committee of the Faculty of Tropical Medicine, Mahidol University, Thailand approved the Thai antigen discovery and validation studies (MUTM 2014-025-01 and 02, and MUTM 2013-027-01, respectively). The Ethics Review Board of the Fundação de Medicina Tropical Dr. Heitor Vieira Dourado (FMT-HVD) (957.875/2014) approved the Brazilian antigen discovery study. The samples used from Brazil for the validation study were approved by the FMT-HVD (51536/2012), by the Brazilian National Committee of Ethics (CONEP) (349.211/2013) and by the Ethics Committee of the Hospital Clinic, Barcelona, Spain (2012/7306). The National Health Research and Ethics Committee of the Solomon Islands Ministry of Health and Medical Services (HRC12/022) approved collection of the samples used from the Solomon Islands for the validation study. The Human Research Ethics Committee at WEHI approved samples for use in Melbourne (#14/02).

Field Sites and Sample Collection: Antigen Discovery Study.

Samples from two longitudinal cohorts, located in Thailand and Brazil, were used for the antigen discovery studies. The longitudinal study in Thailand was conducted from April 2014 to September 2015, as previously described (Longley et al., Am J Trop Med Hyg. 2016 Nov. 2; 95(5):1086-1089). Briefly, 57 symptomatic P. vivax patients were enrolled from either the Tha Song Yang malaria clinic or hospital. Patients with glucose-6-phosphate dehydrogenase (G6PD) deficiency and those aged younger than 7 years or more than 80 years were excluded. Patients were treated with chloroquine (25 mg base/kg body weight, administered over 3 days) and primaquine (15 mg daily, for 14 days) according to the standard Thai treatment regimen. Anti-malarial drugs were given under directly-observed treatment in order to reduce the likelihood of treatment failure and the presence of recurrent infections during follow-up. Volunteers were followed for 9 months after enrolment, with finger-prick blood samples collected at enrolment and week 1, then every 2 weeks for 6 months, then every month until the end of the study. Blood was separated into packed red cells and plasma at the field site. All blood samples were analysed by both light microscopy and quantitative PCR (qPCR) for the presence of blood-stage parasites. A sub-set of volunteers, n=32, was selected for use in the antigen discovery project. These volunteers had no detectable recurrent infections during the 9-month follow-up, and were the first to complete follow-up.

The longitudinal study in Brazil followed the same format as in Thailand. The study was conducted from May 2014 to May 2015. 91 malaria patients at Fundação de Medicina Tropical Doutor Heitor Vieira Dourado in Manaus aged between 7 and 70 years were enrolled. Individuals with G6PD deficiency or chronic diseases were not enrolled. Patients were treated according to the guidelines of the Brazilian Ministry of Health (3 days chloroquine, 7 days primaquine). Follow-up intervals with finger-prick blood sample collection were as in the Thai study. A sub-set of volunteers, n=33, who had no detectable recurrent infections during the 9-month follow-up, was selected for use in the antigen discovery project.

Field Sites and Sample Collection: Validation Study.

For the validation studies, samples collected from four observational longitudinal cohort studies, conducted in Thailand, Brazil and the Solomon Islands, were used (data from the Solomon Islands not shown). Samples were collected from a cohort of volunteers every month for 1 year. Plasma samples from the final cohort time-point were used in the validation study, n=829 Thailand, n=925 Brazil, and n=751 Solomon Islands.

The Thailand observational cohort was conducted from May 2013 to June 2014 in the Kanchanaburi and Ratchaburi provinces of western Thailand. The design of this study has been published (Longley et al, Clin Vaccine Immunol. 2015 Dec. 9; 23(2):117-24). Briefly, a total of 999 volunteers were enrolled (aged 1-82 years, median 23 years). Volunteers were sampled every month over the yearlong cohort, with 14 active case detection visits performed in total. A total of 609 volunteers attended all visits, with 829 attending the final visit. At each visit, volunteers completed a brief survey outlining their health over the past month (to determine the possibility of missed malarial infections), in addition to travel history and bed net usage. A finger-prick blood sample was also taken and axillary temperature recorded. Blood samples were separated into packed red blood cells, for detection of malaria parasites, and plasma, for antibody measurements, at the field sites. In addition to the monthly active case detection visits, passive case detection was also performed routinely by local malaria clinics.

The Brazilian observational cohort was conducted from April 2013 to April 2014 in three neighbouring communities located on the outskirts of Manaus, Amazonas State. Briefly, a total of 1274 residents of all age groups were enrolled (range 0-102 years, median 25 years). Volunteers were sampled every month over the yearlong period, with 13 active case detection visits performed in total. At each visit a finger-prick blood sample was collected, with the exception of children aged less than one year, from whom blood was collected from the heel or big toe. As per the Thai cohort study, at each visit body temperature was also recorded and a questionnaire undertaken outlining the participants' health, bed net usage and travel history. Passive case detection was performed routinely by local health services. Blood samples were processed as per the Thai cohort. Plasma samples from 925 volunteers were available from the final visit.

The Solomon Islands observational cohort was conducted from May 2013 to May 2014 in 20 villages on the island of Ngella, Solomon Islands. 1111 children were initially enrolled, and after exclusion of children who withdrew, had inconsistent attendance or failed to meet other inclusion criteria, 860 remained (Quah Waltmann, in preparation). The age of the children ranged from 6 months to 12 years, with a median age of 5.6 years. Over the yearlong cohort, children were visited approximately monthly, with 11 active case detection visits in total. Of the 860 children, 751 attended the final visit. At each visit, a brief survey was conducted as per the Thai cohort, temperature recorded and a finger-prick blood sample taken. Blood was separated into packed red cells for qPCR and plasma for antibody measurements. In addition to the monthly active case detection visits, local health clinics and centres also performed passive case detection routinely.

Negative Control Samples: Melbourne and Thai Red Cross, Melbourne Blood Donors

Three panels of control samples were collected from individuals with no known previous exposure to malaria. The first panel was collected from the Volunteer Blood Donor Registry (VBDR) at the Walter and Eliza Hall Institute of Medical Research in Melbourne, Australia. These individuals are not screened for diseases, but their past travel, medical history and current drug use are recorded. 102 volunteers from the VBDR were utilized (median age 39 years, range 19-68). The second panel was collected from the Australian Red Cross (Melbourne, Australia). 100 samples were utilized (median age 52 years, range 18-77), and these individuals were screened as per the standard conditions of the Australian Red Cross. Finally, another control panel was collected from the Thai Red Cross (Bangkok, Thailand). Samples from 72 individuals were utilized, but no demographic data were available (hence the age range is unknown). Standard Thai Red Cross screening procedures exclude individuals from donating blood if they had a past malaria infection with symptoms within the last three years, and individuals who have travelled to malaria-endemic regions within the past year.

All studies (antigen discovery and validation) detected malaria parasites by quantitative PCR as previously described (2, 3).

Protein Expression.

Proteins were preferably expressed as full-length proteins, to ensure that any possible antibody recognition site was covered. For very large proteins, fragments were expressed that together cover the entire protein. These were treated as individual constructs in the down-selection process. The proteins were first produced at a small scale with a biotin tag for the antigen discovery study, at Ehime University. A panel of 342 P. vivax proteins was assessed, including well-known P. vivax proteins such as potential vaccine candidates (i.e. MSP1, AMA1, CSP), orthologs of immunogenic P. falciparum proteins and proteins with a predicted signal peptide (SP) and/or 1-3 transmembrane domains (TM) (4). The genes were amplified by PCR and cloned into the pEU_E01 vector with N-terminal His-bis tag (CellFree Sciences, Matsuyama, Japan). P. vivax genes were obtained either from parent clones (4), using SAL-1 cDNA, or commercially synthesized by GenScript (Japan). The recombinant proteins were expressed without codon optimization using the wheat germ cell-free (WGCF) system as previously described (5). WGCF synthesis of the P. vivax protein library was based on the previously described bilayer diffusion system (6). For biotinylation of proteins, 500 nM D-biotin (Nacalai Tesque, Kyoto, Japan) was added to both the translation and substrate layers. Crude WGCF expressed BirA (1 μl) was added to the translation layer. In vitro transcription and cell-free protein synthesis for the P. vivax protein library were carried out using the GenDecoder 1000 robotic synthesizer (CellFree Sciences) as previously described (7, 8). Expression of the proteins was confirmed by western blot using HRP-conjugated streptavidin.

Large-scale protein expression for the down-selected candidates was then performed (CellFree Sciences Tokyo, Japan). Genes were synthesized by GenScript (Japan) and the products cloned into the pEU-E01-MCS expression vector. The sequence of all gene synthesis products and their correct insertion into the expression vector was confirmed by full-length sequencing of the vector inserts. Transcription was performed utilizing SP6 RNA polymerase (80 U/μl) and the SP6 promoter in the pEU-E01-MCS expression vector. For large-scale expression, a dialysis-based refeeding assay was used, with protein expression and solubility first tested on a 50 scale. The test results then enabled production on a 3 ml scale (maintained for up to 72 hours, 15° C.) to produce at least 300 μg of each target protein, using the wheat germ extract WEPRO7240H. The proteins were manually purified once on an affinity matrix (Ni Sepharose 6 Fast Flow from GE Healthcare, Chalfont, United Kingdom) using a batch method (all proteins were expressed with a His-tag at the C terminus). The purified proteins were stored and shipped in the following buffer: 20 mM Na-phosphate pH 7.5, 0.3 M NaCl, 500 mM imidazole and 10% (v/v) glycerol. Protein yields and purity were determined using 15% SDS-PAGE followed by Coomassie Brilliant Blue staining using standard laboratory methods. In addition, proteins were also analyzed by western blot using an anti-His-tag antibody.

An additional 25 proteins were also used in the validation study. 12 proteins were produced using the wheat-germ cell free system described above at Ehime University, and were selected based on associations with past exposure in preliminary work conducted in a PNG cohort. The remaining 13 proteins were produced using standard E. coli methods, and were selected based on their predicted high immunogenicity (due to their status as potential vaccine candidates). References can be found in Appendix II.

AlphaScreen® Assay for the Antigen Discovery Study.

The AlphaScreen® assay was used to measure antibody responses in the antigen discovery study. Plasma samples from the sub-set of volunteers (n=32 Thailand, n=33 Brazil) were used from four time-points: enrolment (week 0) and weeks 12, 24 and 36. Responses were measured against 342 P. vivax proteins. The assay was conducted as previously reported (9), with slight modifications. The protocol was automated by use of the JANUS Automated Workstation (PerkinElmer Life and Analytical Science, Boston, Mass.). Reactions were carried out in 25 μl of reaction volume per well in 384-well OptiPlate microtiter plates (PerkinElmer). First, 0.1 μl of the translation mixture containing a recombinant P. vivax biotinylated protein was diluted 50-fold (5 μl), mixed with 10 μl of 4000-fold diluted plasma in reaction buffer (100 mM Tris-HCl [pH 8.0], 0.01% [v/v] Tween-20 and 0.1% [w/v] bovine serum albumin), and incubated for 30 min at 26° C. to form an antigen-antibody complex. Subsequently, a 10 μl suspension of streptavidin-coated donor-beads and acceptor-beads (PerkinElmer) conjugated with protein G (Thermo Scientific, Waltham, Mass.) in the reaction buffer was added to a final concentration of 12 μg/ml of both beads. The mixture was incubated at 26° C. for one hour in the dark to allow the donor and acceptor-beads to optimally bind to biotin and human IgG, respectively. Upon illumination of this complex, a luminescence signal at 620 nm was detected by the EnVision plate reader (PerkinElmer) and the result was expressed as AlphaScreen counts. A translation mixture of WGCF without template mRNA was used as a negative control. Each assay plate contained a standard curve of total biotinylated rabbit IgG. This enabled standardisation between plates using a 5-parameter logistic standard curve. All samples were run in triplicate. The plates were read in a randomized order to avoid biases.

Multiplexed Bead-Based Assay for the Validation Study.

For validation of the down-selected candidate serological markers, IgG levels were measured in plasma collected from the last time-point of the longitudinal observation studies. IgG measurements were performed using a multiplexed bead-based assay as previously described (10). In brief, 2.5×10⁶ COOH microspheres (Bio-Rad, USA) were prepared for protein coupling by incubation for 20 minutes at room temperature in 100 mM monobasic sodium phosphate (pH 6.2), 50 mg/ml N-Hydroxysulfosuccinimide sodium salt and 50 mg/ml N-(3-Dimethylaminopropyl)-N′-ethylcarbodiimide hydrochloride. Proteins were then added and incubated overnight at 4° C. Optimal amounts of protein were determined experimentally, in order to achieve a log-linear standard curve when using a positive control plasma pool generated from hyper-immune PNG donors. Each assay plate subsequently included this 2-fold serial dilution standard curve (1/50 to 1/25600), to enable standardisation between plates.

The assay was run by incubating 50 μl of the protein-coupled microspheres (500 microspheres/well) with 50 μl test plasma (at 1/100 dilution) in 96-well multiscreen filter plates (Millipore, USA) for 30 minutes at room temperature, on a plate shaker. Plates were washed 3 times and then incubated for a further 15 minutes with the detector antibody, PE-conjugated anti-human IgG (1/100 dilution, Jackson ImmunoResearch, USA). The plates were once again washed and then assayed on a Luminex 200™ instrument. All median fluorescent intensity (MFI) values were converted to relative antibody units using the plate-specific standard curve (five-parameter logistic function, as previously described in detail (10)).
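As a non-limiting sketch of the conversion step described above, the following Python code inverts a five-parameter logistic (5PL) standard curve to map an MFI value to a relative antibody unit, expressed here as the equivalent dilution of the positive control pool. The parameter names (a, b, c, d, g), their numerical values, and the clamping of out-of-range values to the ends of the standard curve are assumptions made for illustration; the curve is taken as already fitted, and the fitting procedure of reference (10) is not reproduced.

```python
def five_pl(x, a, b, c, d, g):
    """Five-parameter logistic curve: response (MFI) at relative concentration x."""
    return d + (a - d) / (1.0 + (x / c) ** b) ** g

def inverse_five_pl(y, a, b, c, d, g):
    """Relative concentration that gives response y; valid only for a < y < d."""
    return c * (((a - d) / (y - d)) ** (1.0 / g) - 1.0) ** (1.0 / b)

# Two-fold serial dilution of the positive control pool, 1/50 to 1/25600.
standard_dilutions = [1.0 / (50 * 2 ** i) for i in range(10)]

# Hypothetical fitted parameters for one plate (not values from the study).
plate_params = dict(a=50.0, b=1.2, c=0.001, d=25000.0, g=0.8)

def mfi_to_relative_units(mfi, params, standards=standard_dilutions):
    """Convert MFI to relative units, clamped to the standard curve range."""
    lowest, highest = min(standards), max(standards)
    if mfi <= five_pl(lowest, **params):
        return lowest   # at or below the most dilute standard
    if mfi >= five_pl(highest, **params):
        return highest  # at or above the most concentrated standard
    return inverse_five_pl(mfi, **params)
```

Clamping keeps every converted value inside the range spanned by the standards, so that extrapolation beyond the measured curve is avoided; other treatments of out-of-range values are equally possible.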

Statistical Modelling.

The models are described in greater detail below (see Example 3).

Statistical Analysis.

All data manipulation and statistical analyses were performed in either R version 3.2.3 (11), Prism version 6 (GraphPad, USA) or Stata version 12.1 (StataCorp, USA).

Results

Down-Selection of Candidate Serological Markers.

The data were processed and candidate serological markers down-selected following the pipeline shown in FIG. 2. The raw AlphaScreen data were converted based on the plate-specific standard curve, resulting in relative antibody units ranging from 0-400. Using the converted data, seropositivity was defined as a relative antibody unit greater than 0. For proteins that were defined as immunoreactive (more than 10% of individuals seropositive at baseline, the time of P. vivax infection), an estimated antibody half-life was determined using a mixed-effects linear regression model, described in detail below (see Statistical modelling). Using the metadata on immunoreactivity and half-life, an initial round of antigen down-selection was performed, with prioritisation of antigens that had similar estimated half-lives in both the Thai and Brazilian datasets (neither site more than double the other), high levels of seropositivity at baseline (more than 50% of individuals seropositive, i.e. relative antibody units above 0), and low levels of error estimated in the model. Three rounds of initial down-selection were performed, resulting in approximately 100 antigens for the next round of model-based down-selection.
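The metadata criteria above can be illustrated, purely as a non-limiting sketch, by the following Python code. It estimates a half-life from an ordinary least-squares slope of log titer against time for a single individual, which is a simplification substituted here for the mixed-effects linear regression model actually used, and then applies the two quantitative prioritisation criteria stated above (half-lives within a factor of two between sites, more than 50% seropositive at baseline). The function names are hypothetical, and the third criterion (low model error) is omitted because no threshold is stated.

```python
import math

def half_life_months(times_months, titers):
    """Half-life from the least-squares slope of log(titer) against time.

    Titers must be positive (seropositive measurements), and must decline
    over time for the half-life to be meaningful.
    """
    logs = [math.log(v) for v in titers]
    t_bar = sum(times_months) / len(times_months)
    y_bar = sum(logs) / len(logs)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times_months, logs))
    sxx = sum((t - t_bar) ** 2 for t in times_months)
    decay_rate = -sxy / sxx  # per month; positive when titers fall
    return math.log(2) / decay_rate

def passes_metadata_filter(hl_thailand, hl_brazil, seropositive_fraction):
    """Prioritisation rule: similar half-lives and high baseline seropositivity."""
    similar = max(hl_thailand, hl_brazil) <= 2.0 * min(hl_thailand, hl_brazil)
    return similar and seropositive_fraction > 0.5
```

For example, titers that halve between each of the 0, 3, 6 and 9 month time-points give an estimated half-life of 3 months.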

The model-based down-selection was performed in two stages: first, by calculating the estimated time since last infection based on antibody levels at 0, 3, 6 and 9 months (and comparing this with the known time since infection), and second, by determining the best combination of antigens for accurately detecting the time since last infection.

In more detail, FIG. 2 shows a pipeline for down-selection of candidate serological markers. As shown in the process of FIG. 2A, antigens were first down-selected based on prioritization of metadata characteristics such as similar levels of estimated antibody longevity in Thailand and Brazil, high levels of immunogenicity at the time of infection and low levels of error estimated in the model. As shown in the process of FIG. 2B, using the initial down-selected antigens, further modelling was performed to identify the optimal combination of antigens able to accurately estimate the time since last infection. A final decision on candidate serological markers was made using output from this modelling and other protein characteristics, as detailed above.

As expected, different antibody kinetic profiles over 9 months were observed for different proteins (see FIG. 3 for an example). Antigen down-selection was performed as described in detail in the Materials and Methods, essentially by prioritizing antigens with high levels of immunogenicity, similar estimated half-lives between Thailand and Brazil and low levels of error estimated in the model. The initial down-selection was followed by further model-based down-selection. The model-based down-selection was used to determine the ability of various proteins to predict the time since last infection, utilizing the same datasets from Thailand and Brazil, and to determine the best combination of proteins to do so successfully (see for example FIG. 20 and its accompanying description). Antigens were excluded from selection if they had less than a 40% probability of inclusion in a 40-antigen panel that was able to accurately determine the time since last infection. Remaining antigens were then ranked according to their probability of inclusion in a successful 20-antigen panel. When required, ranking in 30- and 40-antigen panels was also considered. Antigens were excluded if they had unfavorable protein production characteristics, such as low yield in the small-scale WGCF expression or presence of aggregates. Three rounds of selection were performed: the first resulted in 12 proteins, the second in a further 12, and the third in an additional 31 candidates. A final list of 55 protein constructs (53 unique proteins) representing candidate serological markers of recent exposure to P. vivax infection was generated (two fragments were included from two different antigens). Characteristics of these proteins are highlighted in FIG. 4.
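The exclusion and ranking rule described above may be sketched, as a non-limiting example, as follows. The antigen codes and inclusion probabilities are hypothetical placeholders, and the use of the 30- and 40-antigen panel probabilities as tie-breakers is one possible reading of “when required”; the protein production criteria are not modelled.

```python
def rank_candidates(inclusion_prob, exclude_below=0.40):
    """Exclude and rank antigens by panel inclusion probability.

    inclusion_prob maps antigen code -> {panel size: probability of
    inclusion in a successful panel of that size}.
    """
    # Exclude antigens with < 40% probability of inclusion in a 40-antigen panel.
    kept = {a: p for a, p in inclusion_prob.items() if p[40] >= exclude_below}
    # Rank by the 20-antigen panel, then (assumed tie-breakers) 30 and 40.
    return sorted(kept, key=lambda a: (kept[a][20], kept[a][30], kept[a][40]),
                  reverse=True)

# Hypothetical inclusion probabilities, for illustration only.
example = {
    "A01": {20: 0.90, 30: 0.95, 40: 0.99},
    "B02": {20: 0.20, 30: 0.30, 40: 0.35},
    "C03": {20: 0.50, 30: 0.70, 40: 0.80},
    "D04": {20: 0.50, 30: 0.80, 40: 0.90},
}
ranking = rank_candidates(example)
```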

Validation of Candidate Serological Markers.

Geographical validation (that is, validation across different regions) was performed as follows.

The down-selected markers were chosen based on antibody data from individuals in Thailand, Brazil and the Solomon Islands, three discrete geographical areas. Despite this, there was a strong correlation between the antibody responses measured, in terms of both immunogenicity (seropositivity rates) and antibody level at time of P. vivax infection, as well as the estimated antibody half-lives calculated from consecutive time-points. This is shown in FIG. 5.

Validation in association with recent and past infection was performed as well.

The Luminex® bead-array assay has been successfully established for 40 of the 55 proteins identified in the antigen discovery study (FIG. 6) as well as for the additional 25 proteins (65 total). Plasma samples from three observational cohorts (final time-point) have been screened against these 65 proteins, Thai (n=829), Brazilian (n=925) and Solomon Islands (n=751), in addition to 3 sets of non-exposed (malaria) controls (two panels from Australia and one panel from Thailand). An example of the responses in these cohorts, with relation to time since last infection, to 4 of 65 proteins is shown in FIG. 22, described with regard to Example 4 below.

In the Thai cohort, antibody levels measured to all 17 proteins, selected for performing the first set of tests, were strongly associated with the presence of current P. vivax infections (logistic regression model, odds ratios of 2.8-5.4, p<0.05) (FIG. 7). In addition, antibody levels to 16 of 17 proteins at the last visit of the cohort study were positively and significantly associated with past exposure to P. vivax infections based on PCR results during the yearlong assessment period (measured as the molecular force of blood-stage infection, molFOI) (generalised linear model, exponentiated coefficients of 1.03-1.18, p<0.05) (FIG. 8). The exception was for PVX_090970, exponentiated coefficient 1.03, p=0.073.

In the Brazilian cohort, the effect size, overall, was not as great as for Thailand. Nevertheless, antibody levels to 16 of 17 proteins were strongly associated with the presence of current P. vivax infections (logistic regression model, odds ratios of 1.59-3.04, p<0.05) (FIG. 9). The exception was for PVX_088860, with an odds ratio of 1.33 (p=0.21). Antibody levels to 10 of 17 proteins at the last visit of the cohort were positively and significantly associated with past exposure to P. vivax (molFOI) (generalised linear model, exponentiated coefficients of 1.04-1.18, p<0.05) (FIG. 10). Of the antibodies with estimated ‘short’ half-lives (less than 6 months), there was one exception, PVX_088860, with an exponentiated coefficient of 1.03 (p=0.24). Of the antibodies with estimated ‘long’ half-lives (more than 6 months), 6 of 10 were not associated with past exposure (exponentiated coefficients of 1.02-1.04, p>0.05).

Various statistical methods can be used to test the association between antibody level to certain proteins and past (recent) or current exposure to P. vivax infections. For most proteins, there was a clear significant association with both past and current P. vivax infections, which is promising for the use of these antigens as serological markers. For others, there was a trend towards an association, which did not reach significance. In a final test, it will be an antibody signature that is used for classification of recent infection, made up of antibody responses to a multitude of proteins. Therefore the lack of significance for some individual proteins does not imply that they will not be useful in the final classification algorithm.

These analyses show that, in both Thailand and Brazil, antibodies to at least 16 of 17 proteins are strongly associated with current infections and antibodies to at least 10 of 17 proteins are associated with past P. vivax exposure, demonstrating that a majority of these antigens have the potential to detect both concurrent and recent past P. vivax infections.
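As a non-limiting illustration of how an odds ratio of the kind reported above is obtained, the following Python sketch fits a single-covariate logistic regression of current infection status on log antibody titer by Newton-Raphson iteration and exponentiates the slope. The data are hypothetical, the function names are illustrative, and any covariate adjustment used in the actual analyses (performed in R or Stata) is not reproduced.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Newton-Raphson fit of logit(p) = b0 + b1 * x; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score with respect to b0
            g1 += (yi - p) * xi     # score with respect to b1
            h00 += w                # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: log antibody titer and current infection status (1 = infected).
log_titer = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
infected = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(log_titer, infected)
odds_ratio = math.exp(b1)  # multiplicative change in odds per unit of log titer
```

An odds ratio above 1 indicates that higher antibody levels are associated with higher odds of current infection; the size of the unit depends on the base of the logarithm used for the titers.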

REFERENCES

-   1. Longley R J, Reyes-Sandoval A, Montoya-Diaz E, Dunachie S, Kumpitak C, Nguitragool W, Mueller I, Sattabongkot J. 2015. Acquisition and longevity of antibodies to Plasmodium vivax pre-erythrocytic antigens in western Thailand. Clin Vaccine Immunol doi:10.1128/cvi.00501-15.
-   2. Wampfler R, Mwingira F, Javati S, Robinson L, Betuela I, Siba P, Beck HP, Mueller I, Felger I. 2013. Strategies for detection of Plasmodium species gametocytes. PLoS One 8:e76316.
-   3. Rosanas-Urgell A, Mueller D, Betuela I, Barnadas C, Iga J, Zimmerman P A, del Portillo H A, Siba P, Mueller I, Felger I. 2010. Comparison of diagnostic methods for the detection and quantification of the four sympatric Plasmodium species in field samples from Papua New Guinea. Malar J 9:361.
-   4. Lu F, Li J, Wang B, Cheng V, Kong D H, Cui L, Ha K S, Sattabongkot J, Tsuboi T, Han E T. 2014. Profiling the humoral immune responses to Plasmodium vivax infection and identification of candidate immunogenic rhoptry-associated membrane antigen (RAMA). J Proteomics 102:66-82.
-   5. Sawasaki T, Ogasawara T, Morishita R, Endo Y. 2002. A cell-free protein synthesis system for high-throughput proteomics. Proc Natl Acad Sci USA 99:14652-14657.
-   6. Sawasaki T, Hasegawa Y, Tsuchimochi M, Kamura N, Ogasawara T, Kuroita T, Endo Y. 2002. A bilayer cell-free protein synthesis system for high-throughput screening of gene products. FEBS Lett 514:102-105.
-   7. Sawasaki T, Morishita R, Gouda M D, Endo Y. 2007. Methods for high-throughput materialization of genetic information based on wheat germ cell-free expression system. Methods Mol Biol 375:95-106.
-   8. Sawasaki T, Gouda M D, Kawasaki T, Tsuboi T, Tozawa Y, Takai K, Endo Y. 2005. The wheat germ cell-free expression system: methods for high-throughput materialization of genetic information. Methods Mol Biol 310:131-144.
-   9. Matsuoka K, Komori H, Nose M, Endo Y, Sawasaki T. 2010. Simple screening method for autoantigen proteins using the N-terminal biotinylated protein library produced by wheat cell-free synthesis. J Proteome Res 9:4264-4273.
-   10. Franca C T, Hostetler J B, Sharma S, White M T, Lin E, Kiniboro B, Waltmann A, Darcy A W, Li Wai Shen C S, Siba P, King C L, Rayner J C, Fairhurst R M, Mueller I. 2016. An Antibody Screen of a Plasmodium vivax Antigen Library Identifies Novel Merozoite Proteins Associated with Clinical Protection. PLoS Negl Trop Dis 10:e0004639.
-   11. R Core Team. 2015. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.

EXAMPLE 2 Illustrative Diagnostic Test

A diagnostic test according to at least some embodiments of the present invention could optionally include any of the bead-based assays previously described (AlphaScreen® assay and multiplexed assay using Luminex® technology).

In addition to the ability to measure antibody responses using the bead-based assays previously described, other methods could also be used, including, but not limited to, the enzyme linked immunosorbent assay (ELISA) (1), protein microarray (2) and the luminescence immunoprecipitation system (LIPs) (3).

Antibody measurements via ELISA rely on coating of specialised plates with the required antigen, followed by incubation with the plasma sample of interest. IgG levels are detected by incubation with a conjugated secondary antibody followed by substrate, for example a horseradish peroxidase-conjugated anti-IgG and ABTS [2,2'-azinobis(3-ethylbenzothiazoline-6-sulfonic acid)-diammonium salt].

Protein microarray platforms offer a high-throughput system for measuring antibody responses. Proteins of interest are spotted onto microarray chips then probed with plasma samples. The arrays are then further incubated with a labeled anti-immunoglobulin and analysed using a microarray scanner.

The LIPs assay utilizes cell lysate containing the expressed antigen fused to a Renilla luciferase reporter protein. Plasma samples are incubated with a defined amount of this lysate, with protein A/G beads used to capture the antibody. The amount of antibody-bound antigen-luciferase is measured by the addition of a coelenterazine substrate, and the light emitted is measured using a luminometer.

Any of these assays may optionally be combined with a reader and, if necessary, an analyzer device, to form an apparatus according to at least some embodiments of the present invention. The reader would read the test results and the analyzer would then analyze them according to any of the previously described algorithms and software.

REFERENCES

-   1. Longley R J, Reyes-Sandoval A, Montoya-Diaz E, Dunachie S, Kumpitak C, Nguitragool W, Mueller I, Sattabongkot J. 2015. Acquisition and longevity of antibodies to Plasmodium vivax pre-erythrocytic antigens in western Thailand. Clin Vaccine Immunol doi:10.1128/cvi.00501-15.
-   2. Finney O C, Danziger S A, Molina D M, Vignali M, Takagi A, Ji M, Stanisic D I, Siba P M, Liang X, Aitchison J D, Mueller I, Gardner M J, Wang R. 2014. Predicting anti-disease immunity using proteome arrays and sera from children naturally exposed to malaria. Mol Cell Proteomics doi:10.1074/mcp.M113.036632.
-   3. Longley R J, Salman A M, Cottingham M G, Ewer K, Janse C J, Khan S M, Spencer A J, Hill A V. 2015. Comparative assessment of vaccine vectors encoding ten malaria antigens identifies two protective liver-stage candidates. Sci Rep 5:11820.

EXAMPLE 3 Illustrative Software Process for Diagnosis

This Example relates to processes for estimation of time since last P. vivax infection using measurements of antibody titers, which may optionally be provided through software.

Section 1 relates to calibration and validation of the input data, as well as non-limiting examples of models and algorithms which may optionally be used to analyze the data. Section 2 provides additional information on the algorithms utilized.

Section 1: Overview of Calibration Data and Algorithms

Calibration and Validation Data

Both the down-selection of antigens for incorporation into a diagnostic test, and the calibration and validation of algorithms for providing classifications of recent P. vivax infection given blood samples, will depend on the available epidemiological data. Data will be required on the demography of the populations under investigation, serological measurements, and monitoring for parasitemia and clinical episodes. Table 1 provides an overview of the data sets that are used.

Algorithm Inputs and Outputs

A diagnostic test will take a blood sample as input and provide data to inform a decision making process as output. The type of data generated will depend on the technological specifications of the diagnostic platform. The outputted data can then be used as input for an algorithm to inform a decision making process. The following factors need to be taken into consideration when defining the inputs and outputs of a decision making algorithm:

1) Number of Antigens

The number of antigens to which antibodies can be measured will be restricted by the technological specifications of the diagnostic platform under consideration. Measurement of antibodies to a greater number of antigens will in theory provide more data as input for an algorithm, potentially increasing predictive power.

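TABLE 1 Overview of data sets used for antigen down-selection and algorithm calibration and validation. Columns are grouped as demographic data (region, number, age), serological data (number of antigens, samples per person, platform) and parasitological data (samples per person, PCR positive, clinical).

| Region | Number | Age | Number of antigens | Serological samples per person | Platform | Parasitological samples per person | PCR positive | Clinical |
|---|---|---|---|---|---|---|---|---|
| Antigen down-selection | | | | | | | | |
| Thailand | 32 | 29 (7, 71) | 342 | 4 | AlphaScreen | 17 | enrolment | enrolment |
| Brazil | 33 | | 342 | 4 | AlphaScreen | 17 | enrolment | enrolment |
| Algorithm calibration and validation | | | | | | | | |
| Thailand | 829 | 25 (2, 79) | 65 | 1 | Luminex | 14 | 97/829 | 25/829 |
| Brazil | 928 | 25 (0, 102) | 65 | 1 | Luminex | 13 | 236/928 | 80/928 |
| Solomon Islands | 860 | 5.5 (0.5, 12.7) | 65 | 1 | Luminex | 11 | 294/860 | 35/860 |
| Negative controls | | | | | | | | |
| Australian Red Cross | 100 | 52 (18, 77) | 65 | 1 | Luminex | 1 | no | no |
| Thai Red Cross | 72 | | 65 | 1 | Luminex | 1 | no | no |
| Australian donors | 102 | 39 (19, 68) | 65 | 1 | Luminex | 1 | no | no |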

2) Measurement of Antibody Levels

The levels of antibody in a blood sample can be measured and summarised in a variety of ways.

a) Continuous Measurement

-   -   A continuous measurement that has a monotonic relationship with        antibody titer. It can be compared with a titration curve to        produce an estimate of antibody titer.

b) Binary Classification

-   -   Assesses whether antibody levels are greater or less than some        threshold.

c) Categorical Classification

-   -   Assigns antibody levels to one of a set of pre-defined        categories, e.g. low, medium, high. A categorical classification        can be generated via a series of binary classifications.
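The relationship between the three summaries above can be illustrated with a minimal Python sketch. The cut-off values, units and category labels below are hypothetical placeholders chosen only for illustration; they are not taken from the data described here.

```python
def binary_class(titer, threshold):
    """Binary classification: True (sero-positive) if the titer exceeds the threshold."""
    return titer > threshold


def categorical_class(titer, cutoffs, labels):
    """Categorical classification built from a series of binary classifications.

    The category index is the number of cut-offs that the titer exceeds,
    so len(labels) must equal len(cutoffs) + 1.
    """
    if len(labels) != len(cutoffs) + 1:
        raise ValueError("need exactly one more label than cut-offs")
    n_exceeded = sum(binary_class(titer, c) for c in cutoffs)
    return labels[n_exceeded]


# Hypothetical cut-offs (arbitrary titer units) and labels, for illustration only.
CUTOFFS = [1e-4, 1e-3]
LABELS = ["low", "medium", "high"]

example = categorical_class(5e-4, CUTOFFS, LABELS)  # exceeds one cut-off
```

Counting exceeded cut-offs keeps the categorical result consistent with the binary results it is built from, which may matter on platforms that only report sero-negative or sero-positive per threshold.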

3) Decision Making Requirements

The result of a diagnostic test and accompanying algorithm can be usedto inform a decision on whether or not to treat, as well as to informsurveillance systems.

a) Classification of Recent Infection

-   -   A binary output corresponding to whether or not there was an        infection with P. vivax blood-stage parasites in the past 9        months. This can be presented as a binary classification, or as        a probabilistic classification. This can be adjusted for a range        of different temporal thresholds: 3 months, 6 months, 12 months,        18 months.

b) Estimation of Time Since Last Infection

-   -   An estimate of the time since last P. vivax blood-stage        infection. Depending on the available calibration data, the time        since last infection can be defined either as the time since        last PCR-detectable blood-stage parasitemia, or as the time        since last mosquito bite. Time since last infection can be        estimated continuously or categorically. Concurrent estimation        of uncertainty will be important.

c) Medium-Term Serological Exposure

Given sufficient calibration data, the algorithms described here can be modified to provide extended measurements of an individual's recent to medium term P. vivax exposure, e.g. how many infections in the last 2 years?

4) Computational and Analytic Capabilities

An algorithm's complexity will be restricted by the analytic resources accompanying the diagnostic platform. In a low resource setting, we may require a decision to be made given a sequence of binary outputs from a rapid diagnostic test (sero-negative or sero-positive) without any access to computational devices. At the other extreme, in a high resource setting we may have continuous measurements of antibodies to multiple antigens accompanied with algorithms encoded in computational software.

-   -   a) No access to computational devices. Algorithms implemented        via ‘easy to follow’ instructions on paper charts.
    -   b) Algorithm implemented in software that can be installed on a        portable computation device such as a smartphone or tablet. May        require the manual entry of output from the diagnostic platform.
    -   c) Computational software with encoded algorithms integrated        within the diagnostic platform.

Algorithms

There is a wide range of algorithms for classification and regression in the statistical inference and machine learning literature (Hastie, Tibshirani & Friedman). A classification algorithm can take a diverse range of input data and provide some binary or categorical classification as output. A regression algorithm can take similar input, but provides a continuous prediction as output. Table 2 provides an overview of some algorithms that can be used for classification problems. Four of these have been regularly described in the statistical learning literature: linear discriminant analysis (LDA); quadratic discriminant analysis (QDA); decision trees; and random forests. One of these has been specifically developed for the application at hand: combined antibody dynamics (CAD). The candidate algorithms are classified according to a number of factors. The degree of transparency describes the straightforwardness and reproducibility of an algorithm. A decision tree is considered very transparent as it can be followed by a moderately well-informed individual, as it requires answering a sequence of questions in response to measured data. This simple, logical structure makes decision trees particularly popular with doctors. Because of the transparency and ease of use, decision trees are sometimes referred to as glass box algorithms. At the other extreme, algorithms such as random forests are considered to be black box algorithms where there may be no obvious association between the inputs and outputs.

TABLE 2 Overview of algorithms suitable for classification of recent P. vivax infection or estimation of time since last P. vivax infection.

| Algorithm | Data needs | Transparent | Stochastic | Time predicted | Comments |
|---|---|---|---|---|---|
| linear discriminant analysis (LDA) | continuous | + | no | no | The assumption of common covariance for each category may be too restrictive. |
| quadratic discriminant analysis (QDA) | continuous | + | no | no; categorical estimation possible, incorporation of uncertainty challenging | There is an approximate equivalence between the QDA classification space and that predicted by the CAD algorithm. |
| decision trees | binary | +++ | no | no; possible via regression trees or categorical estimation | Very transparent and simple to implement in low technology settings. |
| random forests | continuous | −− | yes | no; possible via regression trees or categorical estimation | Potentially very powerful but requires considerable computational resources. |
| combined antibody dynamics (CAD) | continuous | ++ | no | yes; with uncertainty | A biologically motivated representation of antibodies following infection; prior information on decay rates can be incorporated. |

Section 2—Expanded Details of Algorithms

Here we provide an overview of classification algorithms such as LDA, QDA, decision trees and random forests which have already been described extensively elsewhere (Hastie, Tibshirani & Friedman³). We also provide an extended description of the combined antibody dynamics (CAD) algorithm.

Linear and Quadratic Discriminant Analysis

The theory of linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA) is described in detail in “The Elements of Statistical Learning: Data Mining, Inference and Prediction” by Hastie, Tibshirani & Friedman⁶. We provide a brief overview of how these methods may be applied. A key assumption for LDA and QDA classification algorithms is that individuals who have similar antibody titers are likely to have the same classification. It is convenient to compare individuals with different antibody profiles via Euclidean distance of log antibody titers. An LDA or QDA classifier can be implemented by fitting multivariate Gaussian distributions to the clusters of data points representing ‘old’ and ‘new’ infections. Assume we have measurements of p antibodies. Let k ∈ {new, old} denote the classes of training individuals with new and old infections. These can be modelled as multivariate Gaussians:

$f_{k}(x) = \frac{1}{(2\pi)^{p/2}\,\left| \Sigma_{k} \right|^{1/2}}\, e^{-\frac{1}{2}(x - \mu_{k})^{T}\Sigma_{k}^{-1}(x - \mu_{k})}$

where μ_(k) and Σ_(k) are the mean and p×p covariance matrix of the training data of each class. In the case of LDA, all classes are assumed to have the same covariance matrix (Σ_(new)=Σ_(old)=Σ), and the classification between new and old infections can be evaluated via the log ratio:

$\log\left( \frac{P(\mathrm{new} \mid X = x)}{P(\mathrm{old} \mid X = x)} \right) = -\frac{1}{2}(\mu_{new} + \mu_{old})^{T}\Sigma^{-1}(\mu_{new} - \mu_{old}) + x^{T}\Sigma^{-1}(\mu_{new} - \mu_{old})$

which is linear in x. The two categories are therefore separated by a hyperplane in p-dimensional space.
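As a minimal sketch of the LDA log ratio above, the following Python code fits class means and a pooled covariance matrix for two antibodies (p = 2) and evaluates the log ratio for a new sample. The training values are hypothetical log titers invented for illustration, the restriction to two antibodies is only to keep the matrix inverse explicit, and equal prior probabilities for the two classes are assumed, as in the log ratio as written.

```python
def mean_vec(rows):
    """Mean of a list of 2-dimensional points."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def pooled_cov(groups, means):
    """Pooled 2x2 covariance across classes (the LDA common-covariance assumption)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    n_total = 0
    for rows, mu in zip(groups, means):
        for r in rows:
            d = [r[0] - mu[0], r[1] - mu[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
        n_total += len(rows)
    dof = n_total - len(groups)
    return [[s[a][b] / dof for b in range(2)] for a in range(2)]


def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def lda_log_ratio(x, mu_new, mu_old, sigma_inv):
    """log P(new|x) / P(old|x), assuming equal class priors."""
    diff = [mu_new[j] - mu_old[j] for j in range(2)]
    # w = Sigma^{-1} (mu_new - mu_old)
    w = [sum(sigma_inv[a][b] * diff[b] for b in range(2)) for a in range(2)]
    mid = [(mu_new[j] + mu_old[j]) / 2 for j in range(2)]
    # -1/2 (mu_new + mu_old)^T w + x^T w  ==  (x - mid)^T w
    return sum(w[j] * (x[j] - mid[j]) for j in range(2))


# Hypothetical log10 titers to two antigens (illustration only).
new_rows = [(-2.0, -3.0), (-2.5, -2.6), (-1.8, -3.3), (-2.2, -2.9)]
old_rows = [(-4.0, -3.5), (-4.4, -3.2), (-3.7, -3.9), (-4.1, -3.4)]

mu_new, mu_old = mean_vec(new_rows), mean_vec(old_rows)
sigma_inv = inv2(pooled_cov([new_rows, old_rows], [mu_new, mu_old]))

# A positive log ratio classifies the sample as a 'new' infection.
is_new = lda_log_ratio((-2.3, -3.0), mu_new, mu_old, sigma_inv) > 0
```

The log ratio is zero exactly on the separating hyperplane, which passes through the midpoint of the two class means.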

In QDA, the restriction that Σ_(new)=Σ_(old)=Σ is relaxed and it can be shown that the classification boundary is described by a conic section in p-dimensional space.

LDA and QDA have consistently been shown to provide robust classification for a wide range of problems. The predictive power of these algorithms can be assessed through cross-validation whereby the data is split into training and testing data sets. The algorithm is calibrated using the training data set and subsequently validated using the test data set. An important method for assessing an algorithm's predictive power is to evaluate the sensitivity and specificity. In this context, we define sensitivity to be the proportion of recent infections correctly classified as recent infections, and we define specificity to be the proportion of old infections correctly classified as old infections.

A receiver operating characteristic (ROC) curve allows for detailed investigation of the association between sensitivity and specificity. At one extreme, we could obtain 100% sensitivity and 0% specificity by simply classifying all blood samples as new infections. At the other extreme, we could obtain 100% specificity and 0% sensitivity by classifying all blood samples as old infections. FIG. 25 shows ROC curves describing the classification performance of LDA algorithms for combinations of 4 antigens in Thailand, Brazil and the Solomon Islands.
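The definitions of sensitivity, specificity and the ROC curve given above can be sketched in a few lines of Python. The classifier scores and infection labels below are hypothetical; any score for which larger values indicate a more recent infection (for example the LDA log ratio) could be used in their place.

```python
def sens_spec(scores, is_recent, threshold):
    """Sensitivity and specificity when score > threshold is classified as recent."""
    tp = sum(1 for s, z in zip(scores, is_recent) if z and s > threshold)
    tn = sum(1 for s, z in zip(scores, is_recent) if not z and s <= threshold)
    n_pos = sum(1 for z in is_recent if z)
    n_neg = len(is_recent) - n_pos
    return tp / n_pos, tn / n_neg


def roc_points(scores, is_recent):
    """(1 - specificity, sensitivity) pairs over all thresholds, from (0, 0) to (1, 1)."""
    thresholds = [float("inf")] + sorted(set(scores), reverse=True) + [float("-inf")]
    points = []
    for th in thresholds:
        se, sp = sens_spec(scores, is_recent, th)
        points.append((1.0 - sp, se))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Hypothetical classifier scores and true labels (True = recent infection).
scores = [2.1, 1.4, 0.3, -0.2, -0.9, -1.5]
is_recent = [True, True, False, True, False, False]

area = auc(roc_points(scores, is_recent))
```

The two extreme thresholds reproduce the two extremes described in the text: classifying every sample as old gives the point (0, 0), and classifying every sample as new gives the point (1, 1).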

Decision Trees and Random Forests

Tree-based algorithms partition the space spanned by the data into a set of rectangles with a unique classification applied to each rectangle. As with the LDA and QDA classification algorithms, a great deal of theoretical information is supplied in the book “The Elements of Statistical Learning: Data Mining, Inference and Prediction”.

There are several powerful methods for extending decision tree classifiers including bagging (bootstrap aggregating), boosting and random forests³. These methods can lead to substantially improved classifiers but typically require more computation and more data. In addition to providing powerful classifiers, these algorithms can provide important diagnostics for investigating the association between the signal in the input and the output.

FIG. 23 A-C shows the ROC curves for cross-validated random forest classifiers applied to data sets from Thailand, Brazil and Solomon Islands. Notably, when the training and testing data sets are from the same region, there are many combinations of four antigens that allow sensitivity >80% and specificity >80%. When training and testing data sets are from different regions, it was still possible to obtain combinations of four antigens with sensitivity >80% and specificity >80%.

Modelling of Antibody Dynamics

A key premise of the proposed diagnostic test is that following infection with P. vivax blood-stage parasites, an antibody response will be generated that will change predictably over time (FIG. 13). Here we present a subset of the data that demonstrates how antibodies to P. vivax antigens change over time.

Longitudinal Antibody Titers Following Clinical P. vivax

We have data from longitudinal cohorts in Thailand and Brazil where participants were followed for up to 36 weeks after a symptomatic clinical episode of P. vivax (see also Table 1/Materials and Methods in Example 1, antigen discovery cohorts). Participants were treated with primaquine, and blood samples were frequently tested to ensure they remained free from re-infection. Antibody levels to a wide range of antigens were measured at 12 week intervals to investigate the changing antibody dynamics. The sample data in FIG. 11 illustrates that antibodies exhibit a range of different half-lives, a pattern consistent with the rest of the data (see also FIG. 3). Another important general feature of the data is exhibited here: rapidly decaying antibodies (short half-life) exhibit much more measurement error than slowly decaying antibodies (long half-life).

The decay of anti-malaria antibodies following infection can be described by an exponential or a bi-phasic exponential distribution⁴. Because of the sampling frequency (every 12 weeks) we assume that antibodies decay exponentially. Exponential decay equates to linear decay on a log scale. Therefore we utilise linear regression models. In particular, we utilise a mixed-effects linear regression framework so that we can estimate both the mean rate of antibody decay as well as the standard deviation.

We assume that for individual i we have measurements of antibody titer A_(ijk) at time j to antigen k. We assume that at time 0, antibody titers are Normally distributed⁵ with mean α_(k) ⁰ and standard deviation σ_(α,k) on a log-scale. We assume that an individual's rate of antibody decay is drawn from a Normal distribution with mean r_(k) ⁰ and standard deviation σ_(r,k). The antibody dynamics in the population can therefore be described by the following mixed-effects linear regression model:

log(A_(ijk))˜(α_(k) ⁰+α_(ik))+(r_(k) ⁰+r_(ik))t_(j)+ε_(k)

α_(ik)˜N(0, σ_(α,k))

r_(ik)˜N(0, σ_(r,k))

ε_(k)˜N(0, σ_(m,k))   (1)

This model can be fitted to data using the lmer function (lme4 package) in R. FIG. 11 shows a sample of the fitted profiles of antibody dynamics.
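To illustrate how exponential decay becomes a straight line on the log scale, the following Python sketch fits a separate least-squares line to each individual's log titers and then summarises the rates across individuals. This two-stage approach is a simplified stand-in for the mixed-effects model in equation (1), not an implementation of it, and the titers, half-lives and sampling times are hypothetical values generated without measurement error.

```python
import math
from statistics import mean, stdev


def fit_log_linear(times, titers):
    """Ordinary least squares fit of log(titer) = intercept + rate * time."""
    y = [math.log(a) for a in titers]
    t_bar, y_bar = mean(times), mean(y)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - y_bar) for t, v in zip(times, y))
    rate = sxy / sxx
    return y_bar - rate * t_bar, rate


def half_life(rate):
    """Half-life implied by an exponential decay rate (rate < 0)."""
    return math.log(2) / -rate


WEEKS = [0, 12, 24, 36]  # sampling every 12 weeks, as in the cohorts


def simulate(a0, hl_weeks):
    """Hypothetical noise-free titers decaying with the given half-life."""
    return [a0 * 0.5 ** (t / hl_weeks) for t in WEEKS]


# Three hypothetical individuals with different initial titers and half-lives.
cohort = [simulate(1e-2, 10), simulate(2e-2, 12), simulate(5e-3, 14)]
fits = [fit_log_linear(WEEKS, titers) for titers in cohort]

rates = [rate for _, rate in fits]
mean_rate = mean(rates)  # rough analogue of the mean decay rate in equation (1)
sd_rate = stdev(rates)   # rough analogue of the between-individual standard deviation
```

With only four time points per person, per-individual fits are noisy in real data, which is one reason a mixed-effects framework that pools information across individuals is preferable in practice.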

Estimation Using Antibodies to a Single Antigen

Here we describe an algorithm that uses a biologically-motivated model of the decay of antibody titers over time to facilitate statistical inference of the time since last infection. A key requirement of this algorithm is some prior knowledge of the decay rates of antibodies. This can be achieved either through estimation of antibody decay rates from longitudinal data as described in equation (1), or estimation of decay rates from cross-sectional antibody measurements as presented in FIG. 12.

The linear regression model for the decay of antibody titers described in equation (1) has three sources of variation: (i) variation in initial antibody titer following infection; (ii) between individual variation in antibody decay rate; and (iii) measurement error. Notably, all these sources of variation are described by Normal distributions (FIG. 13a) so their combined variation will also be described by a Normal distribution. Therefore, the expected log antibody titer to antigen k in individual i at time t can be described by the following distribution.

x_(ik)˜N(α_(k) ⁰+r_(k) ⁰t, σ_(α,k) ²+t²σ_(r,k) ²+σ_(m,k) ²)   (2)
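As a brief sketch of where the mean and variance in equation (2) come from, and assuming (this is not stated explicitly above) that the individual-level effects α_(ik) and r_(ik) and the measurement error ε_(k) are mutually independent, taking the expectation and variance of the right-hand side of equation (1) at time t gives:

$E\lbrack x_{ik} \rbrack = \alpha_{k}^{0} + r_{k}^{0}t$

$\mathrm{Var}\lbrack x_{ik} \rbrack = \mathrm{Var}\lbrack \alpha_{ik} \rbrack + t^{2}\,\mathrm{Var}\lbrack r_{ik} \rbrack + \mathrm{Var}\lbrack \varepsilon_{k} \rbrack = \sigma_{\alpha,k}^{2} + t^{2}\sigma_{r,k}^{2} + \sigma_{m,k}^{2}$

The factor t² arises because the individual-specific decay rate r_(ik) is multiplied by t, so uncertainty about an individual's decay rate contributes more to the spread of titers the longer ago the infection occurred.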

The probability distribution of the expected antibody titer to antigen k in individual i at time t is given by the following distribution:

$\begin{matrix}{{P( {x_{ik}\text{|}t} )} = {\frac{1}{\sqrt{2\;{\pi( {\sigma_{\alpha,k}^{2} + {t^{2}\sigma_{r,k}^{2}} + \sigma_{m,k}^{2}} )}}}e^{- \frac{{({x_{ik} - \alpha_{k}^{0} - {r_{k}^{0}t}})}^{2}}{2{({\sigma_{\alpha,k}^{2} + {t^{2}\sigma_{r,k}^{2}} + \sigma_{m,k}^{2}})}}}}} & (3)\end{matrix}$

Note that we have x_(ik) ∈ (−∞, +∞), as x_(ik) denotes the log antibody titer and measurements of antibody titer are assumed to be positive. The probability distribution for the time since infection t given measured antibody titer x_(ik) can be calculated by inverting equation (3) using Bayes rule³.

$\begin{matrix}{{P( {t\text{|}x_{ik}} )} = \frac{{P( {x_{ik}\text{|}t} )}{P(t)}}{P( x_{ik} )}} & (4)\end{matrix}$

The time since last infection will have a lower bound of zero. We can choose to impose an upper bound of either the individual's age ‘a’ or positive infinity. Choosing positive infinity allows us to better handle the case where an individual was never infected: the low measured antibody titers will be consistent with a very large time since last infection, possibly greater than the age of the individual. Therefore we should only restrict t to the interval (0, a) if we know for certain that the individual has been infected. In practice, we choose some large time t_(max) for our upper bound. We assume P(t) denotes a uniform distribution on the interval (0, t_(max)). P(x_(ik)) is a normalising constant which is calculated via numerical integration to ensure that P(t|x_(ik)) denotes a probability distribution.

Equation (4) provides a probability distribution for the time since last infection. For the purposes of a diagnostic test we may be more interested in obtaining a binary classification, e.g. was the individual infected within the last 9 months. It is usually not possible to definitively make such a categorisation, but we can instead calculate the corresponding probabilities as follows:

P _(0-9m)(x _(ik))=∫₀ ⁹ P(t|x _(ik))dt

P _(9m+)(x _(ik))=∫₉ ^(t) ^(max) P(t|x _(ik))dt   (5)
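Equations (3) to (5) can be sketched numerically as follows. This is a minimal Python illustration that evaluates the likelihood on a grid of times, normalises it under the uniform prior on (0, t_max), and sums the posterior mass below 9 months. All parameter values (α⁰, r⁰ and the three standard deviations) are hypothetical, time is measured in months, and the midpoint rule is one simple choice of numerical integration among several.

```python
import math


def likelihood(x, t, alpha0, r0, sd_alpha, sd_r, sd_m):
    """Equation (3): Normal density of log titer x at time t since infection."""
    var = sd_alpha ** 2 + t ** 2 * sd_r ** 2 + sd_m ** 2
    return math.exp(-((x - alpha0 - r0 * t) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def posterior_grid(x, params, t_max=120.0, n=2400):
    """Equation (4) on a midpoint grid, with a uniform prior on (0, t_max).

    The constant prior density cancels in the normalisation.
    """
    dt = t_max / n
    ts = [(i + 0.5) * dt for i in range(n)]
    weights = [likelihood(x, t, **params) for t in ts]
    norm = sum(weights) * dt  # numerical estimate of the normalising constant
    return ts, [w / norm for w in weights], dt


def prob_recent(x, params, cutoff=9.0, t_max=120.0, n=2400):
    """Equation (5): posterior probability of infection within `cutoff` months."""
    ts, dens, dt = posterior_grid(x, params, t_max, n)
    return sum(d for t, d in zip(ts, dens) if t < cutoff) * dt


# Hypothetical parameters for one antigen (log titer units, time in months).
PARAMS = dict(alpha0=-4.0, r0=-0.1, sd_alpha=0.5, sd_r=0.02, sd_m=0.3)

p_high_titer = prob_recent(-4.2, PARAMS)  # titer close to the post-infection level
p_low_titer = prob_recent(-8.0, PARAMS)   # titer consistent with a distant infection
```

The probability of infection more than 9 months ago is one minus the returned value, since the posterior is normalised on (0, t_max).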

Combined Antibody Dynamics: Estimation Using Antibodies to MultipleAntigens

Previously, we described how the antibody titer to a single antigen can be used to estimate the time since last infection. However, in practice there is too much noise to make an accurate estimate of time since last infection with a single antigen. Increasing the number of measured antibodies can increase the information content in our data, allowing us to obtain more accurate estimates of time since last infection. In particular, selecting antibodies with a range of half-lives may increase our power to resolve infection times more accurately.

FIG. 14 shows a schematic of the dynamics of antibodies to two antigens. We have rapidly decaying antibody 1 and slowly decaying antibody 2. At baseline, antibody titers are likely to be correlated, so we assume that initial titer following infection is described by a multivariate Normal distribution with covariance matrix Σ_(α). The between individual rates of antibody decay may also be correlated (i.e. all antibody titers may decay particularly quickly in some individuals) so we also assume that decay rates are described by a multivariate Normal distribution with covariance matrix Σ_(r). Finally, there will be measurement error associated with each antibody. In particular, we assume the measurement errors between different antibodies are independent so that the total measurement error can be described by a multivariate Normal distribution with diagonal covariance matrix Σ_(m).

$\begin{matrix}{P(x_{i} \mid t) = (2\pi)^{-\frac{k}{2}}\,\left| \Sigma_{\alpha} + t^{2}\Sigma_{r} + \Sigma_{m} \right|^{-\frac{1}{2}}\, e^{-\frac{1}{2}(x_{i} - \alpha^{0} - r^{0}t)^{T}(\Sigma_{\alpha} + t^{2}\Sigma_{r} + \Sigma_{m})^{-1}(x_{i} - \alpha^{0} - r^{0}t)}} & (6)\end{matrix}$

The method for estimating the time since last infection given the multivariate probability distribution for the measured vector of antibody titers x_(i) is the same as described in equation (4).
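A minimal Python sketch of the multivariate likelihood in equation (6), restricted to two antigens so that the determinant and inverse can be written out explicitly, is given below. The model parameters are hypothetical: antigen 1 is given a fast decay rate and antigen 2 a slow one, loosely mirroring the schematic of FIG. 14, and time is measured in months.

```python
import math


def total_cov(sig_a, sig_r, sig_m, t):
    """Sigma_alpha + t^2 Sigma_r + Sigma_m for two antigens."""
    return [[sig_a[i][j] + t * t * sig_r[i][j] + sig_m[i][j] for j in range(2)]
            for i in range(2)]


def mvn2_density(x, mean, cov):
    """Bivariate Normal density, using the explicit 2x2 determinant and inverse."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    d0, d1 = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form d^T cov^{-1} d
    q = (cov[1][1] * d0 * d0 - (cov[0][1] + cov[1][0]) * d0 * d1 + cov[0][0] * d1 * d1) / det
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))


def cad_likelihood(x, t, model):
    """Equation (6) with k = 2 antigens."""
    mean = [model["alpha0"][j] + model["r0"][j] * t for j in range(2)]
    cov = total_cov(model["sig_a"], model["sig_r"], model["sig_m"], t)
    return mvn2_density(x, mean, cov)


def prob_recent_cad(x, model, cutoff=9.0, t_max=120.0, n=2400):
    """Posterior probability of infection within `cutoff` months (uniform prior)."""
    dt = t_max / n
    ts = [(i + 0.5) * dt for i in range(n)]
    weights = [cad_likelihood(x, t, model) for t in ts]
    return sum(w for t, w in zip(ts, weights) if t < cutoff) / sum(weights)


# Hypothetical two-antigen model (log titer units, months).
MODEL = {
    "alpha0": [-4.0, -4.5],
    "r0": [-0.25, -0.05],                             # fast and slow decay
    "sig_a": [[0.25, 0.10], [0.10, 0.25]],            # correlated initial titers
    "sig_r": [[0.0004, 0.0001], [0.0001, 0.0004]],    # correlated decay rates
    "sig_m": [[0.09, 0.0], [0.0, 0.04]],              # independent measurement error
}

p_recent = prob_recent_cad([-4.5, -4.6], MODEL)  # titers expected ~2 months after infection
p_old = prob_recent_cad([-9.0, -5.5], MODEL)     # titers expected ~20 months after infection
```

For more than two antigens the same structure applies, with the 2×2 determinant and inverse replaced by general linear algebra routines.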

Selecting Optimal Combinations of Antigens

Machine learning algorithms take data from a large number of streams and identify which data streams have the most signal for classifying output. Such methods typically involve a greedy algorithm which will provide a good but not necessarily optimal solution. Greedy algorithms take the next best step, i.e. including the next antigen that gives the biggest increase in predictive power. As such they may provide a locally optimal solution but not necessarily a globally optimal solution. Simulated annealing algorithms provide an alternative to greedy algorithms, with a higher likelihood of obtaining a globally optimal solution⁷.

Here we describe how a simulated annealing algorithm can be applied to the combined antibody dynamics (CAD) classifier to select a combination of antigens that provides optimal predictive power. Assume that P measurements of antibodies are available. We want to select some subset of these that maximises predictive power. Denote y to be a vector of 0's and 1's indicating whether the p^(th) antibody is included in our panel. Thus for example we may have

y=(0, 0, 1, 1, 0, 1, 0, 0, 1)   (7)

The vector of binary states depicted in equation (7) selects four antibodies and will therefore correspond to a vector of four antibody measurements (re-indexed here from 1 to 4) as follows:

x _(i)=(x _(i,1) , x _(i,2) , x _(i,3) , x _(i,4))   (8)

Given data from I individuals on measured antibody responses, we can calculate the probability that each individual was infected within the last 9 months P_(0-9m)(x_(i)) or greater than 9 months ago P_(9m+)(x_(i)). Let z_(i) be an indicator denoting whether individual i was infected in the last 9 months (z_(i)=1) or not (z_(i)=0). We can then write down the likelihood of the data as follows:

$\begin{matrix}{{L(y)} = {\prod\limits_{i = 1}^{I}\;{{P_{0 - {9\; m}}( x_{i} )}^{z_{i}}{P_{{9\; m} +}( x_{i} )}^{1 - z_{i}}}}} & (9)\end{matrix}$

The challenge is to select a binary vector y corresponding to a combination of antigens that maximises the likelihood in equation (9) and thus has the highest likelihood of correctly classifying infections according to whether they occurred in the last 9 months.

If we have P antigens, there are 2^(P) combinations of antigens. For P>15 it is not computationally feasible to test all possible combinations. We therefore utilise a simulated annealing algorithm for exploring the state space of combinations and identifying the optimal combinations subject to various constraints (e.g. enforcing a maximum of 10 antigens to a panel). FIG. 20 shows the results, and this contributed to the initial down-selection of antigens as described in Example 1.
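The search over binary inclusion vectors y can be sketched with a generic simulated annealing loop in Python. The score function below is a hypothetical placeholder that stands in for the log of the likelihood in equation (9), which would require real antibody data to evaluate; the per-antigen gains, the cooling schedule, the number of steps and the panel size limit are likewise assumptions made only for illustration.

```python
import math
import random


def simulated_annealing(score, n_antigens, max_panel,
                        n_steps=5000, t_start=1.0, t_end=0.01, seed=0):
    """Search binary inclusion vectors y for a high-scoring antigen panel.

    `score` maps a list of 0/1 flags to a number to be maximised.
    Returns the best vector visited and its score.
    """
    rng = random.Random(seed)
    y = [0] * n_antigens
    y[rng.randrange(n_antigens)] = 1  # start from a single random antigen
    current = score(y)
    best_y, best = y[:], current
    for step in range(n_steps):
        # Geometric cooling from t_start down to t_end
        temp = t_start * (t_end / t_start) ** (step / (n_steps - 1))
        candidate = y[:]
        p = rng.randrange(n_antigens)
        candidate[p] = 1 - candidate[p]  # add or remove one antigen
        if not 1 <= sum(candidate) <= max_panel:
            continue  # enforce the panel size constraint
        new = score(candidate)
        # Always accept improvements; accept worse moves with a
        # temperature-dependent probability to escape local optima.
        if new >= current or rng.random() < math.exp((new - current) / temp):
            y, current = candidate, new
            if current > best:
                best_y, best = y[:], current
    return best_y, best


# Hypothetical per-antigen contributions to the score (illustration only).
GAINS = [0.2, -0.1, 0.9, 0.7, 0.1, 0.8, -0.3, 0.05, 0.6]


def toy_score(y):
    """Placeholder objective standing in for log L(y) from equation (9)."""
    return sum(g for g, included in zip(GAINS, y) if included)


best_y, best = simulated_annealing(toy_score, len(GAINS), max_panel=4)
```

Unlike a greedy search, the acceptance rule occasionally takes a step that lowers the score, for example removing one antigen from a full panel so that a better one can be added in a later step.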

REFERENCES

-   1 White, N. J. Determinants of relapse periodicity in Plasmodium vivax malaria. Malaria Journal 10, 297, doi:10.1186/1475-2875-10-297 (2011).
-   2 Mueller, I. et al. Key gaps in the knowledge of Plasmodium vivax, a neglected human malaria parasite. Lancet Infectious Diseases 9, 555-566 (2009).
-   3 Hastie, T., Tibshirani, R. & Friedman, J. The elements of statistical learning: Data mining, inference, and prediction. Second edn, (Springer, 2009).
-   4 White, M. T. et al. Dynamics of the Antibody Response to Plasmodium falciparum Infection in African Children. Journal of Infectious Diseases 210, 1115-1122, doi:10.1093/infdis/jiu219 (2014).
-   5 Yman, V. et al. Antibody acquisition models: A new tool for serological surveillance of malaria transmission intensity. Scientific Reports 6, doi:10.1038/srep19472 (2016).
-   6 Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning: Data Mining, Inference and Prediction. (Springer, 2001).
-   7 Kirkpatrick, S., Gelatt Jr, C. D. & Vecchi, M. P. Optimization by simulated annealing. Science 220, 671-680 (1983).

EXAMPLE 4 Additional Testing of Antigens

This non-limiting Example relates to additional testing of antibody responses to various P. vivax proteins, present in the blood, as potential antigens for a diagnostic test. It further relates to selection of Plasmodium vivax antigens for classification of samples with past blood-stage infections.

The blood collection and laboratory work were generally performed according to the materials and methods described in Example 1.

Overview of Epidemiological Cohorts

Data was obtained from longitudinal cohorts in three different regions of the P. vivax endemic world. In each cohort, approximately 1,000 individuals were followed over time for approximately 1 year, with active case detection samples taken every month. These samples were supplemented by passive case detection samples from individuals experiencing clinical episodes of P. vivax or P. falciparum. An overview of the data collected is shown in Table 3, and age-stratified prevalence of PCR detectable blood-stage infection within the last 9 months is shown in FIG. 21.

In addition, data was obtained from three cohorts of negative controls who were highly unlikely to have ever been exposed to malaria. These cohorts consisted of 102 individuals from the Victorian Blood Donor Registry (VBDR), 100 individuals from the Australian Red Cross, and 72 individuals from the Thai Red Cross (residents of Bangkok with no reported history of malaria).

TABLE 3 Epidemiological overview of cohorts analysed for the association between P. vivax antibody titers and time since last PCR detectable infection. Number of samples per individual and age are shown as median with range.

| | Thailand | Brazil | Solomon Islands |
|---|---|---|---|
| number of individuals | 829 | 928 | 860 |
| samples per individual | 14 (4, 18) | 13 (4, 16) | 10 (6, 11) |
| Female | 454 (54.8%) | 471 (50.7%) | 416 (48.4%) |
| age (years) | 24 (1, 78) | 25 (0, 103) | 5.5 (0.5, 12.7) |
| PCR infection during study | 97 (11.7%) | 236 (25.4%) | 294 (34.2%) |
| PCR infection in last 9 months | 72 (8.7%) | 205 (22.1%) | 265 (30.8%) |
| PCR infection in last 3 months | 44 (5.3%) | 119 (12.8%) | 156 (18.1%) |
| PCR infection at final time point | 25 (3.0%) | 40 (4.3%) | 93 (10.8%) |

Measured Antibody Responses

In each of the three longitudinal cohorts, antibody responses were measured at the final time point to allow investigation of the association between antibody response and time since last infection. The antibody responses to 65 antigens were measured. Forty of these antigens were selected following a previously published down-selection procedure from a starting panel of 342 wheat-germ expressed proteins. These 40 proteins were supplemented by another 25 purified P. vivax proteins obtained from collaborators. These P. vivax antigens were coupled to COOH micro-beads, and a multiplexed Luminex assay was used to measure Mean Fluorescence Intensity (MFI) for each antigen in each sample. MFI measurements were converted to antibody titers by calibration to measurements from a hyper-immune pool of Papua New Guinean adults. FIG. 22 shows the measured response to 4 of the 65 antigens, and the variation with time since last infection.

Selection of Optimal Combinations of Antigens for Classification

Initial Investigation of Combinations of Parameters

Of the 65 P. vivax proteins considered, 5 were excluded because of poor immunogenicity, which resulted in missing data from a large proportion of samples. This left a panel of 60 antigens for detailed investigation and further down-selection. The aim is to identify combinations of up to 5 antigens that can provide accurate classification within a single cohort, and to identify combinations of 8-15 antigens that can classify accurately across multiple cohorts with a wide range of transmission intensities and age ranges.

Without wishing to be limited by a single hypothesis, selection was optimized for three classification targets:

1. Surveillance target. Select combinations of antigens such that both sensitivity and specificity are given equal weight in optimisation. This is done by maximising the area under the curve (AUC) of a receiver operating characteristic (ROC) curve.

2. Serological Screen and Treat (SSAT) target. Select combinations of antigens that maximise sensitivity (e.g. >95%) while enforcing a lower bound on specificity (e.g. >50%).

3. Surveillance target. Select combinations of antigens that maximise specificity (e.g. >95%) while enforcing a lower bound on sensitivity (e.g. >50%).
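
By way of non-limiting illustration, the three targets above can be expressed as operations on a ROC curve. The following is a minimal sketch only, assuming that a classifier has already produced one score per sample. The function names (`roc_points`, `auc`, `best_sensitivity`, `best_specificity`) and the toy scores are hypothetical and are not taken from the study.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each candidate threshold.

    A sample is called positive when its score is >= threshold.
    labels: 1 = recent infection, 0 = no recent infection.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


def auc(scores, labels):
    """Target 1: area under the ROC curve, via the rank (Mann-Whitney) form."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_sensitivity(points, min_specificity):
    """Target 2: maximise sensitivity subject to a lower bound on specificity."""
    ok = [p for p in points if p[2] >= min_specificity]
    return max(ok, key=lambda p: (p[1], p[2])) if ok else None


def best_specificity(points, min_sensitivity):
    """Target 3: maximise specificity subject to a lower bound on sensitivity."""
    ok = [p for p in points if p[1] >= min_sensitivity]
    return max(ok, key=lambda p: (p[2], p[1])) if ok else None


# Toy classifier scores (illustrative only, not study data).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 1, 1, 1]
points = roc_points(scores, labels)
print(auc(scores, labels))
print(best_sensitivity(points, 0.5))
print(best_specificity(points, 0.5))
```

The two constrained selections return a whole operating point (threshold, sensitivity, specificity), so the same ROC curve can serve all three targets without refitting the classifier.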

The first step is to identify combinations of antigens for which there is a strong signal enabling classification. This was done by using a linear discriminant analysis (LDA) classifier to test all combinations of antigens of size up to 5. Above size 5, it was not computationally feasible to evaluate all possible combinations. Therefore, for larger sizes, combinations of size n+1 were evaluated by identifying the optimal 500 combinations of size n and adding each remaining antigen to them individually.
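
The size of this search can be illustrated with a short sketch. It assumes one reading of the strategy described above: exhaustive scoring for small combinations, then extension of the best-ranked combinations by one antigen at a time. The scoring function `toy_score` is a hypothetical stand-in for cross-validated LDA performance, and the antigen names and weights are invented for the example.

```python
from itertools import combinations
from math import comb


def exhaustive_search(antigens, max_size, score):
    """Score every combination of 1..max_size antigens, best first per size."""
    results = {}
    for k in range(1, max_size + 1):
        scored = [(score(c), c) for c in combinations(antigens, k)]
        results[k] = sorted(scored, reverse=True)
    return results


def extend_top(ranked, antigens, score, keep=500):
    """Build size n+1 candidates from the `keep` best size-n combinations."""
    candidates = set()
    for _, combo in ranked[:keep]:
        for a in antigens:
            if a not in combo:
                candidates.add(tuple(sorted(combo + (a,))))
    return sorted(((score(c), c) for c in candidates), reverse=True)


# Number of combinations drawn from a 60-antigen panel, by size.
for k in range(1, 7):
    print(k, comb(60, k))

# Toy panel and toy score (hypothetical; stands in for LDA performance).
antigens = ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"]
weights = {a: i + 1 for i, a in enumerate(antigens)}


def toy_score(combo):
    return sum(weights[a] for a in combo)


results = exhaustive_search(antigens, 3, toy_score)
size4 = extend_top(results[3], antigens, toy_score, keep=5)
print(results[3][0])
print(size4[0])
```

The extension step evaluates at most `keep` times the panel size candidates per round, rather than every combination of the next size, which is what keeps larger panels tractable.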

Optimisation of Algorithms Given Most Likely Parameter Combinations

Given a subset of n antigens, a range of classification algorithms was considered: LDA, quadratic discriminant analysis (QDA), decision trees, and random forests. For a given algorithm and subset of antigens, classification performance was assessed through cross-validation. The key to cross-validation is to use disjoint training and testing data sets to assess classification performance. For each cohort, this is done by randomly selecting ⅔ of the data as the training set and testing the algorithm on the remaining ⅓. This is repeated 200 times and the average of the cross-validated ROC curves is calculated.
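
The resampling scheme described above (a random ⅔ training set, the remaining ⅓ for testing, repeated 200 times) can be sketched as follows. This is an illustrative sketch under stated assumptions: for brevity it averages a single scalar metric rather than whole ROC curves, and the classifier (`fit_midpoint`), the metric (`accuracy`) and the toy data are hypothetical placeholders rather than the algorithms used in the study.

```python
import random
from statistics import mean


def repeated_holdout(samples, fit, evaluate, n_repeats=200, train_frac=2 / 3, seed=0):
    """Average a test-set metric over repeated random train/test splits."""
    rng = random.Random(seed)
    n_train = round(len(samples) * train_frac)
    results = []
    for _ in range(n_repeats):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        # Training and testing sets are disjoint by construction.
        train, test = shuffled[:n_train], shuffled[n_train:]
        model = fit(train)
        results.append(evaluate(model, test))
    return mean(results)


def fit_midpoint(train):
    """Toy one-antigen classifier: threshold midway between the class means."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (mean(pos) + mean(neg)) / 2


def accuracy(threshold, test):
    """Fraction of test samples classified correctly by the threshold."""
    return mean([1.0 if (x >= threshold) == (y == 1) else 0.0 for x, y in test])


# Toy, well-separated data (illustrative only): (titer, label) pairs.
samples = [(float(t), 0) for t in range(1, 16)] + [(float(t), 1) for t in range(101, 116)]
cv_accuracy = repeated_holdout(samples, fit_midpoint, accuracy)
print(cv_accuracy)
```

Passing `fit` and `evaluate` as arguments keeps the resampling loop independent of the classifier, so the same loop could in principle wrap LDA, QDA, decision trees or random forests.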

FIGS. 23A-23C show cross-validated ROC curves for assessing the classification performance of random forest algorithms (determined according to the randomForest library in R). In cases where algorithms were trained and tested on data from the same region, many different combinations of 4 antigens resulted in sensitivity and specificity greater than 80%. Even when an algorithm was trained on data from one region and then tested on data from another region of the world, it was still possible to obtain combinations of antigens with both sensitivity and specificity greater than 80%, with the exception of algorithms trained on data from Thailand and tested on data from the Solomon Islands.

Ranking of Antigens

Multiple factors determine whether or not an antigen will contribute to classification of recent infection. These include but are not limited to: antibody dynamics; immunogenicity of recent infections compared to old infections and measurements from control samples; area under the ROC curve when considering one antigen at a time; frequency of selection in top combinations of antigens. FIG. 24 shows a network visualisation of how combinations of 4 antigens are selected. The size of each node represents the likelihood that an antigen is selected, and the width and colour of an edge represent the probability that a pair of antigens is selected in combination. Therefore, the most commonly selected antigens are biggest and cluster in the centre of the network. There was a high degree of consistency in the antigens that were selected in each of the three cohorts, with the most strongly identified antigens being RBP2b (V3), L01, L31, X087885 (X7), PvEBP (V11), L55, PvRipr (V8) and L54.
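
The node and edge quantities described for the network visualisation can be derived by counting how often each antigen, and each pair of antigens, appears among the top-ranked combinations. The sketch below assumes that "likelihood" and "probability" are simple fractions of the top combinations. The function name `selection_network` and the antigen labels `Ag1` to `Ag7` are hypothetical, and the combinations are invented for the example.

```python
from collections import Counter
from itertools import combinations


def selection_network(top_combinations):
    """Return (node_weight, edge_weight) as fractions of the top combinations.

    node_weight[a]      : fraction of combinations that contain antigen a
    edge_weight[(a, b)] : fraction that contain both a and b (a < b)
    """
    n = len(top_combinations)
    nodes = Counter()
    edges = Counter()
    for combo in top_combinations:
        members = sorted(set(combo))
        nodes.update(members)
        edges.update(combinations(members, 2))
    node_weight = {a: c / n for a, c in nodes.items()}
    edge_weight = {pair: c / n for pair, c in edges.items()}
    return node_weight, edge_weight


# Invented top combinations of 4 antigens (illustrative only).
top = [
    ("Ag1", "Ag2", "Ag3", "Ag4"),
    ("Ag1", "Ag2", "Ag5", "Ag6"),
    ("Ag1", "Ag3", "Ag4", "Ag7"),
    ("Ag1", "Ag2", "Ag3", "Ag5"),
]
node_weight, edge_weight = selection_network(top)
print(node_weight["Ag1"], node_weight["Ag2"])
print(edge_weight[("Ag1", "Ag2")], edge_weight[("Ag2", "Ag3")])
```

Node weights of this kind correspond to the "top 1% of combinations" style of ranking, where an antigen that appears in nearly every top combination dominates the centre of the network.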

Table 4 shows a ranking of antigens according to a range of criteria. The top two antigens, RBP2b and L01, are preferred candidates. The next six antigens are likely candidates. The next seven antigens are possible candidates. Also included are an additional nine antigens worth further consideration.

TABLE 4 List of antigens ranked according to their contribution to classification of individuals with PCR detectable blood-stage P. vivax in the last 9 months. The area under the curve (AUC) is based on using antibody titers to a single antigen for classification. Combinations of antigens were investigated by assessing classification performance of linear discriminant analysis (LDA) for all combinations of 4 antigens from the initial panel of 60 antigens. Recent infection sero-positivity shows the proportion of individuals with PCR detectable P. vivax in the last 9 months, with the threshold of sero-positivity defined as the geometric mean titer (GMT) plus two standard deviations of the negative controls.

                      Area Under Curve              Top 1% of combinations        Recent infection
                      (1 antigen)                   (4 antigens)                  sero-positivity
antigen               Thailand  Brazil  Solomons    Thailand  Brazil  Solomons    Thailand  Brazil  Solomons
RBP2b (V3)            0.849     0.818   0.868       89.7%     98.5%   100.0%      70.8%     64.4%   45.7%
L01                   0.812     0.787   0.697       43.5%     23.9%   4.3%        51.4%     56.6%   14.3%
L31                   0.805     0.762   0.766       5.0%      2.7%    3.7%        25.0%     38.0%   7.4%
X087885 (X7)          0.807     0.748   0.697       20.3%     9.2%    14.6%       41.7%     81.0%   50.9%
PvEBP (V11)           0.794     0.739   0.707       5.0%      2.4%    3.1%        55.6%     41.0%   7.8%
L55                   0.79      0.781   0.643       17.2%     20.9%   2.6%        38.9%     29.8%   3.5%
PvRipr (V8)           0.754     0.772   0.646       3.0%      9.1%    3.1%        31.9%     29.3%   4.8%
L54                   0.79      0.727   0.654       5.6%      4.4%    3.1%        26.4%     19.0%   2.2%
L07                   0.747     0.765   0.599       3.1%      5.3%    2.8%        27.8%     41.5%   3.9%
L30                   0.732     0.61    0.609       2.3%      3.8%    5.4%        47.2%     11.7%   9.6%
PvDBPII (V10)         0.74      0.773   0.639       1.7%      2.6%    4.0%        20.8%     47.3%   3.5%
L34                   0.767     0.746   0.67        4.5%      16.6%   2.2%        12.5%     19.0%   3.9%
X092995 (X6)          0.792     0.703   0.642       11.5%     1.9%    5.6%        15.3%     34.1%   10.0%
L12                   0.755     0.731   0.637       3.5%      6.1%    2.9%        16.7%     15.1%   3.0%
RBP1b (V1)            0.533     0.578   0.525       24.1%     4.7%    2.5%        0.0%      0.0%    0.0%
L23                   0.759     0.753   0.658       4.0%      14.8%   2.9%        12.5%     19.5%   5.7%
L02                   0.746     0.724   0.677       2.7%      3.7%    3.9%        15.3%     13.7%   2.6%
L32                   0.705     0.651   0.493       3.7%      1.9%    30.2%       4.2%      3.9%    0.4%
L28                   0.759     0.744   0.667       3.8%      2.5%    2.4%        45.8%     33.2%   9.1%
L19                   0.753     0.67    0.664       2.6%      2.3%    6.5%        33.3%     19.5%   10.9%
L36                   0.727     0.698   0.662       3.2%      1.8%    2.8%        36.1%     22.0%   10.4%
L41                   0.702     0.66    0.636       2.55      1.7%    3.3%        29.2%     17.6%   8.3%
X088820 (X4)          0.723     0.666   0.638       4.0%      1.8%    6.7%        15.3%     35.6%   14.8%
PvDBP.. Sacl (V13)    0.716     0.761   0.616       1.7%      2.6%    7.2%        16.7%     36.6%   1.3%
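
The sero-positivity threshold named in the caption of Table 4 (geometric mean titer plus two standard deviations of the negative controls) can be illustrated as below. The sketch assumes the common convention of working on the log scale, so that the cutoff is exp(mean(log titer) + 2 × SD(log titer)), and it uses the sample standard deviation; both are assumptions, since the scale is not stated here. The function names and the toy titers are hypothetical.

```python
from math import exp, log
from statistics import mean, stdev


def seropositivity_cutoff(control_titers, n_sd=2.0):
    """Cutoff from negative controls, computed on the log scale (assumed).

    Returns exp(mean(log t) + n_sd * sd(log t)), i.e. the geometric mean
    titer shifted upwards by n_sd standard deviations of the log titers.
    """
    logs = [log(t) for t in control_titers]
    return exp(mean(logs) + n_sd * stdev(logs))


def seropositive_fraction(titers, cutoff):
    """Proportion of samples whose titer exceeds the cutoff."""
    return sum(t > cutoff for t in titers) / len(titers)


# Toy titers in arbitrary units (illustrative only, not study data).
controls = [1.0, 2.0, 4.0]
recent = [5.0, 10.0, 20.0, 9.0]
cutoff = seropositivity_cutoff(controls)
print(cutoff)
print(seropositive_fraction(recent, cutoff))
```

With these toy controls the log titers are evenly spaced, so the geometric mean is 2 and the cutoff lies two doublings above it; real control data would not be this regular.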

FIG. 25 shows Receiver Operating Characteristic (ROC) curves for assessing the trade-off between sensitivity and specificity for a cross-validated linear discriminant analysis (LDA) classifier applied to data from Thailand, Brazil and the Solomon Islands.

APPENDIX I Protein Insert aa sequence (add M asInsert DNA sequence (Start from No. Protein Name Referencestart/His-tag at C-term) ATG to His-tag stop codon) 1 merozote surfacePVX_099980 MNESKEILSQLLNVQTQLLTMSSEHT ATGAACGAGTCCAAGGAGATCCTCAGCCAACTprotein 1 (MSP1), CIDTNVPDNAACYRYLDGTEEWRCLLCCTGAACGTGCAAACCCAGCTCCTGACCATGT MSP1-19 TFKEEGGKCVPASNVTCKDNNGGCACCAGCGAGCACACCTGCATCGACACCAACGTC PEAECKMTDSNKIVCKCTKEGSEPLFCCAGACAACGCCGCCTGCTACAGGTACCTGGA EGVFCSHHHHHHCGGCACCGAGGAGTGGCGCTGCCTCCTGACCT TCAAGGAAGAGGGCGGCAAGTGCGTGCCAGCCTCCAACGTCACCTGCAAGGACAACAACGGCG GCTGCGCTCCAGAGGCTGAGTGCAAGATGACCGACAGCAACAAGATCGTGTGCAAGTGCACC AAGGAAGGCTCCGAGCCACTCTTCGAGGGCGTCTTCTGCAGCCACCACCACCACCACCACTGA 2 tryptophan-rich antigen PVX_096995MKTETVTSRSNPHQAIEYANQGPSR ACCCACACCAAGCCATCGAGTACGCCAACCAG (Pv-fam-a)DKVEEWKRNAWTDWMVQLDDDWK GGCCCATCCAGGGACAAGGTGGAGGAGTGGDFNAQIEEEKKAWIEEKEGDWVILLK AAGCGCAACGCCTGGACCGACTGGATGGTCCHLQNKWLHFNPNLDAEYQTDMLAKS AACTCGACGACGACTGGAAGGACTTCAACGCCETWDERQWKMWISTEGKQLLEMDL CAGATCGAGGAAGAGAAGAAGGCCTGGATTGKKWFTNNEMIYCKWTMDEWNEWKN AGGAGAAGGAAGGCGACTGGGTCATCCTCCTEKIKEWVTSEWKESEDQYWSKYDDA GAAGCACCTCCAAAACAAGTGGCTGCACTTCATIQTLTVAERNQWFKWKERIYREGIE ACCCAAACCTCGACGCCGAGTACCAGACCGACWKNWIAIKESKFVNANWNSWSEWK ATGCTGGCCAAGTCCGAGACGTGGGACGAGANEKRLEFNDWIEAFVEKWIRQKQWLI GGCAGTGGAAGATGTGGATCAGCACCGAGGGWTDERKNFANRQKAAPGGVAAAPG CAAGCAGCTCCTGGAGATGGACCTCAAGAAGVFAPRPAFGAPSGFAPRPGFAAPSQ TGGTTCACCAACAACGAGATGATCTACTGCAAPPRYSFAAASGYVAPSATSEAAPATS GTGGACCATGGACGAGTGGAACGAGTGGAAEAPASAEATTALSSETTTPVNPEETA GAACGAGAAGATCAAGGAGTGGGTGACCTCCASPEAATPVNPEETAASSETTTVNPE GAGTGGAAGGAGAGCGAGGACCAATACTGGTATPVNPEAPVAEPEKKEEEPAAEPLL CCAAGTACGACGACGCCACCATCCAAACCCTGAIEPAQTEPAALEAAPSTSAHHHHHH ACCGTCGCCGAGCGCAACCAGTGGTTCAAGTGGAAGGAGAGGATCTACCGCGAGGGCATCGA GTGGAAGAACTGGATCGCCATCAAGGAGAGCAAGTTCGTGAACGCCAACTGGAACTCCTGGTC TGAGTGGAAGAACGAGAAAAGGCTGGAGTTCAACGACTGGATCGAGGCCTTCGTCGAGAAGT GGATCCGCCAAAAGCAGTGGCTGATCTGGACCGACGAGAGGAAGAACTTCGCCAACCGCCAA AAGGCTGCTCCAGGCGGCGTGGCTGCCGCCCCAGGCGTCTTCGCCCCACGCCCAGCCTTCGGC 
GCCCCATCCGGCTTCGCCCCAAGGCCAGGCTTCGCTGCTCCAAGCCAGCCACCACGCTACTCCTT 3 sporozoite invasion- PVX_088860MQLELEPAPDYESTSPTVPVRLLLHD ATGAGTCCATCAGCCCAATCGTGCCAGTCAGGassociated protein 2, DYAPNAEDMFGPEASQVMTNLYETIDCTCCTGCTCCATGATGATTACGCCCCAAACGC putative (SIAP2)EDGTTTDGYQNGSDDDQSNQSDSN CGAGGACATGTTCGGCCCAGAGGCCTCCCAADDAVMLNYLSNETDSFDELIDEIDNHK GTGATGACCAACCTCTACGAGACGATCGACGAKKKKIYSPLRKPVLKRSDSSDSLSDY GGACGGCACCACCACCGACGGCTACCAAAACELDEVLRQTENEPEEDEDLDLSLEDS GGCTCCGACGACGACCAAAGCAACCAGTCCGFEVINYPWKDILESSPYSTDHTNEED ACAGCAACGACGACGCCGTCATGCTCAACTACFSSLEELELEDPVQEMNFGKLKFFEI CTGTCCAACGAGACGGACAGCTTCGACGAGCTGDPDLLIRKTPITPNTKTKSGLEKNGN CATCGACGAGATCGACAATCACAAGAAGAAGNTEASNINQHEKEKMDKRKRRTHKQ AAGAAGATCTACTCCCCACTCAGGAAGCCAGTFKNPIENFSVTTTYDDFLKQNGLRDH GCTGAAGCGCAGCGACTCCAGCGACTCCCTGAPSKHQKDSSEPFVLDQYNYRNAKFK GCGACTACGAGCTCGACGAGGTCCTGCGCCANVRFYILRMLYDNIKDIGLKEFQYLKS GACCGAGAACGAGCCAGAGGAAGACGAGGAHKYEVEEFIKNILRNNLICLTFSQEDHL CCTGGACCTCTCCCTGGAGGACAGCTTCGAGGFNDAHLLIEKASIKSEHHHHHH TCATCAACTACCCATGGAAGGACATCCTGGAGTCCAGCCCATACAGCACCGACCACACCAACGA GGAAGACTTCTCCAGCCTGGAGGAGCTGGAGCTGGAGGACCCAGTCCAAGAGATGAATTTCG GCAAGCTGAAGTTCTTCGAGATCGGCGACCCAGACCTGCTCATCAGGAAGACCCCAATCACCCC AAACACCAAGACCAAGTCCGGCCTGGAGAAGAATGGCAACAACACCGAGGCCAGCAACATCA ACCAGCACGAGAAGGAGAAGATGGACAAGCGCAAGAGGCGCACCCACAAGCAATTCAAGAA CCCAATCGAGAACTTCTCCGTGACCACCACCTACGACGACTTCCTCAAGCAAAACGGCCTGAGG GACCACCCAAGCAAGCACCAGAAGGACTCCAGCGAGCCATTCGTGCTCGACCAATACAACTAC 4 rhoptry neck protein 2, PVX_117880MNAGDGQGVYGGNGINNPLVYHVQ GCGGAAACGGCATCAACAACCCACTCGTGTACputative (RON2) HGVNIPNSNSDKKASDHTPDEDEDTYCACGTCCAGCACGGCGTCAACATCCCAAACTC GRTRNKRYMHRNPGEKYKGSNSPHCAACAGCGACAAGAAGGCCAGCGACCACACC DSNDDSGDTEYELNEGDVKRLTPKNCCAGACGAGGACGAGGACACCTACGGCAGGA KKGATTEEVDTYPYGKKTNGSEFPRCCCGCAACAAGAGGTACATGCACCGCAACCCA MNGSETGHYGYNNTGSGGHNDENGGGCGAGAAGTACAAGGGCTCCAACAGCCCAC YTPIIVKYDNTHAKNRANEIEENLNKGACGACTCCAACGACGACAGCGGCGACACCGA EYSRIKMAKGKKGQKSGGYESDGEDGTACGAGCTGAACGAGGGCGACGTGAAGAG SDVDSSNVFYVDNGQDMLIKEKMSRGCTCACCCCAAAGAACAAGAAGGGCGCCACC 
SEGPDEMSEEGLNVKYKAQRGPVNYACCGAGGAAGTGGACACCTACCCATACGGCA HFSNYMNLDKRNTLSSNEIELQKMIGAGAAGACCAACGGCAGCGAGTTCCCACGCAT PKFSEEVNKYCRLNEPSSKKGEFLNVGAACGGCTCCGAGACGGGCCACTACGGCTAC SFEYSRALEELRSEMINELQKRKAVGAACAACACCGGCAGCGGCGGCCACAACGACG SNYYNNILNAIYTSMNRKNANFGRDAAGAACGGCTACACCCCAATCATCGTGAAGTAC YEDKSFISEANSFRNEEMQPLSAKYNGACAACACCCACGCCAAGAACAGGGCCAACG KILRQYLCHVFVGNPGVNQLERLYFHAGATCGAGGAGAACCTCAACAAGGGCGAGTA NLALGELIEPIRRKYNKLASSSVGLNYCTCCCGCATCAAGATGGCCAAGGGCAAGAAG EIYIASSSNIYLMGHLLMLSLAYLSYNSGGCCAAAAGTCCGGCGGCTACGAGAGCGACG YFVQGLKPFYSLETMLMANSDYSFFGCGAGGACTCCGACGTCGACTCCAGCAACGT MYNEVCNVYYHPKGTFNKDITFIPIESGTTCTACGTCGACAACGGCCAGGACATGCTGA RPGRHSTYVGERKVTCDLLELILNAYTCAAGGAGAAGATGTCCAGGAGCGAGGGCCC TLINVHEIQKVFNTSEAYGYENSISFGAGACGAGATGAGCGAGGAAGGCCTCAACGTG HNAVRIFSQVCPRDDAKNTFGCDFEKAAGTACAAGGCCCAAAGGGGCCCAGTCAACT STLYNSKVLKMDEGDKENQRSLKRAACCACTTCTCCAACTACATGAACCTGGACAAG FDMLRTFAEIESTSHLGDPSPNYISLIFCGCAACACCCTCTCCAGCAACGAGATCGAGCT EQNLYTDFYKYLFWYDNRELINVQIRCCAGAAGATGATCGGCCCAAAGTTCAGCGAG NAGRRKKGKKVKFVYDEFVKRGKQLGAAGTGAACAAGTACTGCAGGCTGAACGAGC KDKLIKIDAKYNARSKALLVFYALVDKCATCCAGCAAGAAGGGCGAGTTCCTCAACGTC 5 Plasmodium exported PVX_101530MNVNKKSSGEENNTKQALGLRVSRT AGAACAACACCAAGCAAGCTCTGGGCCTGAGprotein, unknown LAKDGANENAEEGLSEEEEEAVEEGGGTGTCCCGCACCCTCGCTAAGGACGGCGCCA function EEEAVEEGEEEVVEEEGEEVVEGEEACGAGAACGCCGAGGAAGGCCTCAGCGAGGA EEVVEGEEEVVEDEEVVEGEEYAEGAGAGGAAGAGGCCGTCGAGGAAGGCGAGGA EEPVEGEEYAEGEEPVEGEEPVEVEAGAGGCCGTGGAGGAAGGCGAGGAAGAGGT EYAEGEEPVEGEEYAEGEEPVEGEEGGTCGAGGAAGAGGGCGAGGAAGTGGTCGA VVEGEEVVEGEEVAEGEEVAEGEEVGGGCGAGGAAGAGGAAGTGGTGGAGGGGG AEGEEAVEGEEVAEGEEVAEGEEVAAGGAAGAGGTGGTGGAGGATGAGGAAGTGG EGEEAAEEGAAEEGATEEGATEEGATGGAGGGCGAGGAGTACGCTGAGGGCGAGG TKEEATEKAAEGEETAESEKPAEEQPAGCCGGTGGAGGGGGAGGAGTACGCCGAGG TTFVETVEKKVEPVSKPPFKPLFPVDGGGAGGAGCCAGTGGAGGGCGAGGAGCCAG EKYLETLEDIAQSFLKEFQEAEGKRKTGGAGGTGGAGGAGTACGCGGAGGGGGAGG QKKVKKRAKKITKKLAKEYAKKFKSKAGCCGGTGGAAGGTGAGGAGTACGCCGAGG KKHHHHHH GCGAGGAGCCTGTCGAGGGGGAGGAAGTGGTGGAAGGCGAGGAAGTGGTGGAAGGTGAGG 
AAGTGGCTGAGGGCGAGGAAGTGGCCGAGGGGGAGGAAGTGGCCGAGGGCGAGGAAGCCG TGGAGGGCGAGGAAGTGGCGGAGGGGGAGGAAGTGGCGGAAGGCGAGGAAGTGGCCGAA GGCGAGGAAGCCGCTGAGGAAGGCGCTGCCGAGGAAGGCGCCACGGAGGAAGGCGCTACC GAGGAAGGCGCCACCAAGGAAGAGGCCACCGAGAAGGCTGCTGAGGGCGAGGAGACGGCT GAGTCCGAGAAGCCAGCTGAGGAGCAACCAACCACCTTCGTGGAGACGGTCGAGAAGAAGGT GGAGCCAGTCAGCAAGCCACCATTCAAGCCACTCTTCCCAGTCGACGAGAAGTACCTCGAAACC CTGGAGGACATCGCCCAATCCTTCCTGAAGGA 6tryptophan/threonine- PVX_112680 MPKPDQKNLKGGVKNAPLQQRKGSATGCCAAAGCCAGACCAAAAGAACCTCAAGG rich antigen VPINPPKPVNDKLKDGSNKTETKNAKGCGGCGTGAAGAACGCCCCACTGCAACAGAG NTLSKPPMQVTDKSKDEAKKTPLQSTGAAGGGCTCCGTGCCAATCAACCCACCAAAGC PKLTPKTKEVPKESNMEMWLKDTKDCAGTCAACGACAAGCTCAAGGACGGCAGCAA EYENLKCQYRTCLYDWFRKINDEYNECAAGACCGAGACGAAGAACGCCAAGAACACC LLNKLEEKWAKFPNDPKNKDVFDNLKCTGTCCAAGCCACCAATGCAAGTGACCGACAA TSSLKNDEKKAQWMRKNLKDLMREQGAGCAAGGACGAGGCCAAGAAGACCCCACTC VDEWLEGKKKIYEGMSPTYWDAWECAGTCCACCCCAAAGCTGACCCCAAAGACCAA KKIAKGLMGAAWYKMNSSGRTKEWGGAAGTGCCAAAGGAGAGCAACATGGAGATG DKLRNELETRYNKKIKSLWGGFHRDVTGGCTCAAGGACACCAAGGACGAGTACGAGA YFRFKEWIEEVFNKWIENKQIDTWMNACCTCAAGTGCCAGTACAGGACCTGCCTGTAC SGKKHHHHHHGACTGGTTCCGCAAGATCAACGACGAGTACAA CGAGCTCCTGAACAAGCTGGAGGAGAAGTGGGCCAAGTTCCCAAACGACCCAAAGAACAAGG ACGTGTTCGACAACCTCAAGACCTCCAGCCTGAAGAACGACGAGAAGAAGGCCCAGTGGATGA GGAAGAACCTCAAGGACCTGATGAGGGAGCAGGTGGACGAGTGGCTGGAGGGCAAGAAGAA GATCTACGAGGGCATGTCCCCAACCTACTGGGACGCCTGGGAGAAGAAGATCGCTAAGGGCCT GATGGGCGCTGCTTGGTACAAGATGAACTCCTCCGGCAGGACCAAGGAGTGGGACAAGCTCAG GAACGAGCTCGAAACCCGCTACAACAAGAAGATCAAGTCCCTCTGGGGCGGCTTCCACAGGGA CGTGTACTTCCGCTTCAAGGAGTGGATCGAGGAAGTGTTCAACAAGTGGATCGAGAACAAGCA AATCGACACCTGGATGAACAGCGGCAAGAAGCACCACCACCACCACCACTGA 7 hypothetical protein PVX_097715MQYSIVKNEITKRRKPKIRNESPPDG CAAGAGGCGCAAGCCAAAGATCAGGAACGAGNSPGGGKNNAAGNNGGGDNNAKNK TCCCCACCAGACGGCAACAGCCCAGGCGGCGAANKAANNAANKAANNAANNAANNA GCAAGAACAACGCTGCTGGCAACAACGGCGGANNAANNAANNAANNAANNAANNAA CGGCGACAACAACGCCAAGAACAAGGCTGCTNNAANNANEQNGNKKKKGKPKKEEA AACAAGGCTGCTAACAACGCCGCCAACAAGGDLPVQAQNENDRNKIEDIADEAELFA 
CCGCCAACAACGCTGCTAACAACGCCGCGAACEEAKMLADLASKRSKEVEQILSSIPEN AACGCCGCCAACAACGCCGCCAACAACGCAGKFGSEPKEDAIFAAKDAVRASEDAMK CTAACAACGCCGCTAACAACGCGGCCAACAACAAQKARAAETVTQANEEKDKAKTAK GCCGCGAACAACGCGGCGAACAACGCTGCCAELAERSAQIVKKNAVEALKEFGKIAEA ACAACGCCAACGAGCAAAACGGCAACAAGAAAEMEAIKIPIPENLKPKKKVKQPRAAA GAAGAAGGGCAAGCCAAAGAAGGAAGAGGCQKVEPTQATAHKVVPPPAEPPRAPS CGACCTCCCAGTGCAAGCCCAGAACGAGAACPPPPPAKPEAAPPAKEVAPAVTTPEA GACAGGAACAAGATCGAGGACATCGCTGArGPKEEAPKADAAPAAPQPAAESKVAK AGGCTGAGCTGTTCGCTGAGGAAGCCAAGATEPTDQSAENQSDSLYKETNIKEGTEE GCTCGCCGACCTGGCCTCCAAGCGCAGCAAGAGTGQEQKQEPELQNLLEQQMNIFYI GAAGTGGAGCAGATCCTCTCCAGCATCCCAGALVQFFKSKIKALIKFLLILVSHHHHHH GAACAAGTTCGGCTCCGAGCCAAAGGAAGACGCCATCTTCGCTGCTAAGGACGCCGTGAGGGC TAGCGAGGACGCCATGAAGGCTGCTCAAAAGGCCAGGGCCGCTGAGACGGTCACCCAGGCCA ACGAGGAGAAGGACAAGGCTAAGACCGCTAAGGAGCTGGCTGAGAGGTCCGCTCAAATCGTG AAGAAGAACGCCGTCGAGGCCCTGAAGGAGTTCGGCAAGATCGCCGAGGCCGCCGAGATGGA GGCCATCAAGATCCCAATCCCAGAGAACCTGAAGCCAAAGAAGAAGGTGAAGCAACCAAGGGC CGCCGCCCAAAAGGTGGAGCCAACCCAAGCTACCGCTCACAAGGTGGTGCCACCACCAGCTGA 8 41K blood stage antigen PVX_084420MDENTGWPIDYEFNSKTLPSIEVKLS ACGAGTTCAACTCCAAGACCCTGCCAAGCATCprecursor 41-3, putative PPENPLPQVAAEIKLLESARLKLEEGGAGGTGAAGCTCTCCCCACCAGAGAACCCACT MMQKLEDEYNKSLSSAKIKIQDTVEKGCCACAAGTCGCCGCCGAGATCAAGGTCCTGG SLSIFNDPNMLGSVISNSVKMLRSENAGAGCGCCCGCCTCAAGCTCGAAGAGGGCAT VKKRTENVQAKHNLKKMQTVNQAKSGATGCAGAAGCTGGAGGACGAGTACAACAAG GPLPPPELRKHTSFLEQNYVNRVLPSTCCCTGTCCAGCGCCAAGATCAAGATCCAAGA VKISLSELTEPSVEIKEKIEEMEQYRTCACCGTGGAGAAGTCCCTCAGCATCTTCAACG DEEVAMFEMAISEFSILTDITILELEKQIACCCAAACATGCTGGGCTCCGTGATCTCCAAC QLQLNPFLVDKKVVHRALTKELKELEAGCGTCAAGATGCTCAGGAGCGAGAACGTGA QREEKQKIKENFQRQSSFIEAGEDEDAGAAGCGCACCGAGAACGTCCAGGCCAAGCA TGNILNVKISQTDYGYPTVDELVMQMCAACCTCAAGAAGATGCAGACCGTCAACCAAG QKRRDISEKLERQKILDLQMKLLKAQCCAAGAGCGGCCCACTCCCACCACCAGAGCTG SEMIKDALHFALSKVIAQYSPLVETMKCGCAAGCACACCTCCTTCCTGGAGCAAAACTA LESMRMLHHHHHHCGTGAACAGGGTCCTGCCATCCGTGAAGATCT CCCTCAGCGAGCTGACCGAGCCAAGCGTCGAGATCAAGGAGAAGATCGAGGAGATGGAGCA 
GTACAGGACCGACGAGGAAGTGGCCATGTTCGAGATGGCCATCTCCGAGTTCAGCATCCTCAC CGACATCACCATCCTGGAGCTGGAGAAGCAAATCCAGCTCCAACTGAACCCATTCCTCGTCGAC AAGAAGGTGGTCCACAGGGCCCTGACCAAGGAGCTCAAGGAGCTGGAGCAGCGCGAGGAGA AGCAAAAGATCAAGGAGAACTTCCAGAGGCAATCCAGCTTCATCGAGGCTGGCGAGGACGAG GACACCGGCAACATCCTCAACGTGAAGATCTCCCAGACCGACTACGGCTACCCAACCGTGGACG AGCTCGTCATGCAGATGCAAAAGAGGCGCGACATCTCCGAGAAGCTGGAGCGCCAGAAGATC 9 rhoptry-associated PVX_085930MSSDGKSSASAKSGSKSGSKYGGSS CTAAGTCCGGCAGCAAGTCCGGCAGCAAGTAprotein 1, putative YSDYSAYDSGSASSVGSREFENEMYCGGCGGCTCCAGCTACTCCGACTACAGCGCCT (RAP1) EFALQHPMEKLTKEMDILKNDYTKVKACGACTCCGGCAGCGCCTCCAGCGTGGGCAG EEEGKILDEEHKEIEEKRKEERLKMLACCGCGAGTTCGAGAACGAGATGTACGAGTTC EGDVEKNKGDEEINFIKHDYTDTRIRGGCCCTGCAACACCCGATGGAGAAGCTCACCAA GFTEFLSNLNPFKKEIKPMKKEISLITYGGAGATGGACATCCTGAAGAACGACTACACC IPDKIVNKEKIMRDLGISHKYEPYQQSIAAGGTGAAGGAAGAGGAAGGCAAGATCCTCG LYTCPNSVFFFDSMENLRKELDKNHEACGAGGAGCACAAGGAGATCGAGGAGAAGA KEAITNKILDHNKECLKNFGLFDFELPGGAAGGAAGAGCGCCTCAAGATGCTGGCCGA DNKTKLGNVIGSIGEYHVRLYEIENDLGGGCGACGTGGAGAAGAACAAGGGCGACGA LKYQPSLDYMTLADDYKLVKNDVNTLGGAGATCAACTTCATCAAGCACGACTACACCG ENVNFCLLNPKTLEDFLKKKEIMELMACACCAGGATCCGCGGCGGCTTCACCGAGTTC GEDPIAYEEKFTKYMEESINCHLESLICTCTCCAACCTGAACCCATTCAAGAAGGAGAT YEDLDSSQDTKIVLKNVKSKLYLLQNCAAGCCGATGAAGAAGGAGATCTCCCTCATCA GLTYKSKKLINKLFNEIQKNPEPIFEKLCCTACATCCCAGACAAGATCGTCAACAAGGAG TWIYENMYHLKRDYTFLAFKTVCDKYAAGATCATGCGCGACCTGGGCATCTCCCACAA VSHNSIYTSLQGMTSYIIEYTRLYGACGTACGAGCCATACCAACAGAGCATCCTCTACA FKNITIYNAVISGIHEQMKNLMKLMPRCCTGCCCAAACTCCGTGTTCTTCTTCGACAGCA SGLLSDVHFEALLHKENKKITRTDYVLTGGAGAACCTCAGGAAGGAGCTGGACAAGAA NDYDPSVKAYALTQVERLPMVSVINSCCACGAGAAGGAAGCCATCACCAACAAGATC FFEAKKKALSKMLAQMKLDLFTLTNECTCGACCACAACAAGGAGTGCCTCAAGAACTT DLKIPNDKGANSKLTAKLISIYKAEIKKCGGCCTGTTCGACTTCGAGCTCCCAGACAACA YFKEMRDDYVFLIKARYKGHYKKNYLAGACCAAGCTGGGCAACGTCATCGGCTCCATC LYKRLEHHHHHHGGCGAGTACCACGTGAGGCTCTACGAGATCG AGAACGACCTCCTGAAGTACCAACCAAGCCTGGACTACATGACCCTCGCCGACGACTACAAGCT GGTGAAGAACGACGTCAACACCCTGGAGAACGTGAACTTCTGCCTCCTGAACCCAAAGACCCT 10 hypothetical protein, 
PVX_094830MNTRASKFANSKRKRNGNAMRENKL ATGAACACCAGGGCCTCCAAGTTCGCCAACAG conservedNNDDVDHYSFLSLRTANEEKAATEND CAAGAGGAAGCGCAACGGCAACGCCATGCGCSNNAKKEGEENTNGNEKKNEENGSG GAGAACAAGCTCAACAACGACGACGTGGACCNEKRNEENNANEKKNEQTNDQSNG ACTACTCCTTCCTCAGCCTGAGGACCGCTAACQSNSQTNIPKKNEAVPPEKKINKENLL GAGGAGAAGGCTGCTACCGAGAACGACTCCAEYGTHDKDGHFIPSYKTLTDEILSTNN ACAACGCCAAGAAGGAAGGCGAGGAGAACASLERASSFLKIACSHIMKIVEFIPESKL CCAACGGCAACGAGAAGAAGAACGAGGAGASSQYIKVESKNVYIKDITSECQNIFFSL ACGGCAGCGGCAACGAGAAGCGCAACGAGGEKLTMTMIVLNSKMNKLVYVQDKHHH AGAACAACGCTAACGAGAAGAAGAACGAGCA HHHAACCAACGACCAGTCCAACGGCCAATCCAACA GCCAGACCAACATCCCAAAGAAGAACGAGGCCGTCCCACCAGAGAAGAAGATCAACAAGGAG AACCTCCTGGAGTACGGCACCCACGACAAGGACGGCCACTTCATCCCAAGCTACAAGACCCTC ACCGACGAGATCCTGTCCACCAACAACAGCCTGGAGAGGGCCTCCAGCTTCCTGAAGATCGCCT GCTCCCACATCATGAAGATCGTGGAGTTCATCCCAGAGTCCAAGCTGTCCAGCCAATACATCAA GGTGGAGAGCAAGAACGTCTACATCAAGGACATCACCTCCGAGTGCCAGAACATCTTCTTCAGC CTGGAGAAGCTGACCATGACCATGATCGTCCTCAACAGCAAGATGAACAAGCTGGTCTACGTGC AAGACAAGCACCACCACCACCACCACTGA 11tryptophan-rich antigen PVX_112675 MPKPAQNLKGGVKKPSLQQTKSPLPATGCCAAAGCCAGCCCAAAACCTCAAGGGCG (Pv-fam-a) SKPPKPVNDKLKDDSNKTETKDAKNGCGTGAAGAAGCCATCCCTCCAACAGACCAAG GLNKPPKNINDKVKDGENKTESQDLNTCCCCACTGCCAAGCAAGCCACCAAAGCCAGT EPSFKLPMRQKASSWDAWLKGTKKCAACGACAAGCTCAAGGACGACAGCAACAAG DYENLKCFAKGNLYDWLCSVRDSFEACCGAGACGAAGGACGCCAAGAACGGCCTGA LYLQSLESKWTSCSDNTTTVFLCECLACAAGCCACCAAAGAACATCAACGACAAGGT AESSGWGDPQWESWVKKELKEQLKGAAGGACGGCGAGAACAAGACCCCATCCCAA TEAQAWISTKKKDFDGLTSKYFSLWKGACCTCAACGAGCCAAGCTTCAAGCTGCCAAT DHRRKELEEEAWKTKASSGGLSEWEGAGGCAAAAGGCCTCCAGCTGGGACGCTTGG ELTDKMNTRYTNNLDNMWSNYSGDLCTCAAGGGCACCAAGAAGGACTACGAGAACC LFRFDEWSPEVLEKWIESKQWNQWTGAAGTGCTTCGCCAAGGGCAACCTCTACGAC VKKVRKHHHHHHTGGCTGTGCTCCGTCCGCGACAGCTTCGAGCT CTACCTGCAATCCCTGGAGAGCAAGTGGACCTCCTGCAGCGACAACACCACCACCGTGTTCCTC TGCGAGTGCCTCGCTGAGTCCAGCGGCTGGGGCGACCCACAGTGGGAGTCCTGGGTCAAGAA GGAGCTCAAGGAGCAACTGAAGACCGAGGCCCAGGCCTGGATCAGCACCAAGAAGAAGGACT TCGACGGCCTCACCTCCAAGTACTTCAGCCTGTGGAAGGACCACAGGCGCAAGGAGCTGGAGG 
AAGAGGCCTGGAAGACCAAGGCCTCCAGCGGCGGCCTCTCCGAGTGGGAGGAGCTGACCGAC AAGATGAACACCAGGTACACCAACAACCTCGACAACATGTGGTCCAACTACAGCGGCGACCTCC TGTTCCGCTTCGACGAGTGGTCCCCAGAGGTGCTGGAGAAGTGGATCGAGAGCAAGCAGTGGA ACCAGTGGGTGAAGAAGGTCAGGAAGCACCACCACCACCACCACTGA 12 tryptophan-rich antigen PVX_112670MVTEGGDNLDDDLGGDLEGLLGDDA ACGACCTCGGCGGCGACCTGGAGGGCCTCCT (Pv-fam-a)EGGAAGGEGAAAAASAEGLSGEVEN GGGCGACGACGCTGAGGGCGGCGCCGCCGGELLYVKEDDDDAPAATPDEKPSTSGE CGGCGAGGGCGCTGCCGCCGCCGCCTCCGCCETPAAFVDLVNETVPPPAKAPLPLQT GAGGGCCTGAGCGGCGAGGTGGAGAACGAGKAPQGPKIKDWNQWMKQAKKDFSG CTCCTCTACGTGAAGGAAGACGACGACGACGYKGTMHTQRHEWTKEKEDELQKFCK CTCCAGCTGCTACCCCAGACGAGAAGCCATCCYLEKRWMNYTGNIDRECRSDFLKST ACCAGCGGCGAGGAGACGCCAGCTGCTTTCGQNWNESQWNKWVKSEGKHHMNKQ TGGACCTCGTCAACGAGACGGTGCCACCACCAFQKWLDYNKYKLQDWTNTEWNKWK GCTAAGGCCCCACTCCCACTGCAAACCAAGGCTTVKEQLDDEEWKKKEAAGKTKEWI CCCACAGGGCCCAAAGATCAAGGACTGGAACKCTDKMEKKCLKKTKKHCKNWEKKA CAGTGGATGAAGCAGGCCAAGAAGGACTTCTNSSFKKWEGDFTKKWTSNKQWNS CCGGCTACAAGGGCACCATGCACACCCAAAG WCKELEKHHHHHHGCACGAGTGGACCAAGGAGAAGGAAGACGA GCTGCAGAAGTTCTGCAAGTACCTGGAGAAGCGCTGGATGAACTACACCGGCAACATCGACAG GGAGTGCCGCTCCGACTTCCTGAAGAGCACCCAAAACTGGAACGAGTCCCAGTGGAACAAGTG GGTGAAGAGCGAGGGCAAGCACCACATGAACAAGCAATTCCAGAAGTGGCTGGACTACAACAA GTACAAGCTCCAAGACTGGACCAACACCGAGTGGAACAAGTGGAAGACCACCGTCAAGGAGCA GCTGGACGACGAGGAGTGGAAGAAGAAGGAAGCCGCCGGCAAGACCAAGGAGTGGATCAAG TGCACCGACAAGATGGAGAAGAAGTGCCTCAAGAAGACCAAGAAGCACTGCAAGAACTGGGA GAAGAAGGCCAACTCCAGCTTCAAGAAGTGGGAGGGCGACTTCACCAAGAAGTGGACCTCCA ACAAGCAGTGGAACAGCTGGTGCAAGGAGCT 13Hyp, huge list of PVX_002550 MAVEVVQEAADEVLEEEKIEEPLEIVEACGAGGTGCTCGAAGAGGAGAAGATCGAGG orthologs, paralogs,EEPVQVAAEEPVEEVLEEVVQEAAD AGCCACTGGAGATCGTGGAGGAAGAGCCAGTsynteny with Py LSA3 EVMEEEKIEEPLEIVAEEPLEIVAEEPVGCAAGTCGCCGCCGAGGAGCCAGTCGAGGAA (PyLSA3syn-2)QVAAEEVLVEKEEVNENILNIVEEIKE GTGCTCGAAGAGGTGGTGCAAGAGGCCGCCGSIVDKLEANEEASEEGNEDLLESAEE ACGAGGTCATGGAGGAAGAGAAGATCGAGGAAEEVAEEAVDTTTEADVVETVEEEA AGCCTCTGGAGATCGTCGCTGAAGAACCTCTGANATTEVSAEESLEVSTEAPEETTES 
GAGATCGTGGCTGAGGAGCCTGTGCAGGTGGESHETFEEDILKNLEENKEANENALE CTGCCGAGGAAGTGCTGGTCGAGAAGGAAGADIKEMKEEFLDYVEQRVEDNENVLVD GGTGAACGAGAACATCCTCAACATCGTGGAGLLQHLERNAHVNESVLEDLEEIKEDLL GAGATCAAGGAGAGCATCGTCGACAAGCTGGANIQMAEETRKEVTDASAESAEEVEE AGGCCAACGAGGAAGCCAGCGAGGAAGGCAPVEVSAEVAAEEPVEVAAEEPVEVTA ACGAGGACCTCCTGGAGTCCGCTGAGGAAGCEEPVEVTAEEPVEIPTEENIFDVIEEIK CGCTGAGGAAGTGGCTGAGGAAGCCGTGGACEKVLENLEETTAESVAESVGEGADEN ACCACCACCGAGGCTGACGTGGTGGAGACGGALDVLKEMQESLLENFGQKIEANENIL TGGAGGAAGAGGCCGCTAACGCTACCACCGAASVLENIQEKVELNKSVLVDVLAELKE GGTGTCCGCTGAGGAGAGCCTGGAGGTGTCCEAVSQRETAQEVAAELVEEAAEVPAV ACCGAGGCTCCAGAGGAGACGACCGAGTCCGEPVEEEVVEPAVEVVEEPVEEEVVEP AGAGCCACGAGACGTTCGAGGAAGACATCCTVVDVIEEPAVEVVEVPVEETVEEPVE GAAGAACCTGGAGGAGAACAAGGAAGCCAACVTAEEPVEVTAEEPVEETVEEPVVEV GAGAACGCCCTGGAGGACATCAAGGAGATGAVEEPVEEPVVEAIEEPVVEPVVEPAV AGGAAGAGTTCCTCGACTACGTGGAGCAAAGEVIEDATEEPVEEAAEEPDVEVAEGS GGTCGAGGACAACGAGAACGTGCTGGTCGACAIESVEEAFEQIIEDAAQVIAEESVEET CTCCTGCAGCACCTGGAGCGCAACGCCCACGTAEQILEQATQAVTEEAADAADVADAE GAACGAGAGCGTCCTGGAGGACCTGGAGGAEAVGTAQVVTEESVAEAIEDTVEEISA GATCAAGGAAGACCTCCTGGCCAACATCCAAAEPIQATIEGIVGEVVESVEENIEAVEEA TGGCCGAGGAGACGAGGAAGGAAGTGACCGIKDIVEGAVEGAPELSLEEMIEDVMVG ACGCTTCCGCTGAGAGCGCTGAGGAAGTGGATVAEEDSAKEAAEETVEEVVQEDAAE GGAGCCCGTCGAGGTGTCCGCTGAGGTGGCT 14conserved Plasmodium PVX_090970 mTYMLMKDDDSHDDKDDENEEKKKKATGACCTACATGCTCATGAAGGACGACGACTC protein, unknownEGKTNKDTNKIIKGESMTREDLLQLLN CCACGACGACAAGGACGACGAGAACGAGGA functionEMLKLQTDMKNIVKDLIVVAKKNSYDF GAAGAAGAAGAAGGAAGGCAAGACCAACAAMSVYNVAKTYNTVDPLGKYQIEMPEF GGAGACCAACAAGATCATCAAGGGCGAGAGCDKVVENYHFDPEVKETVSKLMSSQE ATGACCAGGGAGGACCTCCTGCAACTCCTGAANYYANMSETATLNVDKIIEIHHFMLNE CGAGATGCTCAAGCTGCAGACCGACATGAAGLYKIDPEFKKIPNKHELDPKLIALVIQSI AACATCGTCAAGGACCTCATCGTGGTCGCCAAVSAKVEEEFNLTSEDVEASIANQQYA GAAGAACTCCTACGACTTCATGAGCGTGTACALTSNMEFARVNIQMOTIMNKFMGDhh ACGTCGCCAAGACCTACAACACCGTGGACCCA hhhhCTGGGCAAGTACCAAATCGAGATGCCAGAGT TCGACAAGGTGGTCGAGAACTACCACTTCGACCCAGAGGTGAAGGAGACGGTGTCCAAGCTCA 
TGTCCAGCCAGGAGAACTACTACGCCAACATGAGCGAGACGGCCACCCTGAACGTCGACAAGA TCATCGAGATCCACCACTTCATGCTCAACGAGCTGTACAAGATCGACCCAGAGTTCAAGAAGAT CCCAAACAAGCACGAGCTGGACCCAAAGCTCATCGCCCTCGTGATCCAATCCATCGTGAGCGCC AAGGTCGAGGAAGAGTTCAACCTCACCTCCGAGGACGTCGAGGCCAGCATCGCCAACCAACAG TACGCCCTGACCTCCAACATGGAGTTCGCCCGCGTGAACATCCAAATGCAGACCATCATGAACA AGTTCATGGGCGACCACCACCACCACCACCAC TGA 15conserved Plasmodium PVX_084815 mAGGVSEEAIKKLKEIKKLELDILKDFATGGCCGGCGGCGTCAGCGAGGAAGCCATCA protein, unknownMKQDAGHADLYKKYHCIASDYISGNP AGAAGCTCAAGGAGATCAAGAAGCTGGAGCT functionKGSSAEGPNLAKKGEKSKKGEKHQN GGACATCCTGAAGGACTTCATGAAGCAAGACGEKPQNGEKPKKSFIEKIASFVSIFSY GCCGGCCACGCCGACCTCTACAAGAAGTACCANNVSKIYSEHVQRIFPKARDHAGDGS CTGCATCGCCAGCGACTACATCTCCGGCAACCAGDAIYPDDKIETGKKQNQSSYVQLS CAAAGGGCTCCAGCGCTGAGGGCCCAAACCTALNLMKRNMFLGGKDKSSEHFEVGN GGCCAAGAAGGGCGAGAAGAGCAAGAAGGGLGSFYMIFGARNTDYPWACSCDPLQ CGAGAAGCACCAAAACGGCGAGAAGCCACAGLIDYKEKKRNYVLCSNQVDMSIQNAD AACGGCGAGAAGCCAAAGAAGTCCTTCATCG LFCNPKhhhhhhAGAAGATCGCCTCCTTCGTGAGCATCTTCTCCT ACAACAACGTCAGCAAGATCTACTCCGAGCACGTGCAAAGGATCTTCCCAAAGGCCCGCGACCA CGCTGGCGACGGCAGCGCCGGCGACGCCATCTACCCAGACGACAAGATCGAGACGGGCAAGA AGCAAAACCAGTCCAGCTACGTCCAGCTCTCCGCCCTCAACCTGATGAAGCGCAACATGTTCCT GGGCGGCAAGGACAAGTCCAGCGAGCACTTCGAAGTGGGCAACCTCGGCAGCTTCTACATGAT CTTCGGCGCCAGGAACACCGACTACCCATGGGCCTGCTCCTGCGACCCACTCCAGCTGATCGACT ACAAGGAGAAGAAGCGCAACTACGTGCTCTGCAGCAACCAAGTCGACATGTCCATCCAGAACG CCGACCTGTTCTGCAACCCAAAGCACCACCACCACCACCACTGA 16 tryptophan-rich antigen PVX_090270mVSCTSLCLYIIYSLFLLNNVSLSIQVK ATCTACAGCCTCTTCCTCCTGAACAACGTGTCC(Pv-fam-a) TNEIKNGQNGSVQLKEKGGGVNLAP CTGAGCATCCAAGTCAAGACCAACGAGATCAAKVGTNITQKRDTKMAKKTVTKVAKKK GAACGGCCAAAACGGCTCCGTCCAGCTCAAGVTKVAEKTGTKVADKTGTKVADKTGT GAGAAGGGCGGCGGCGTGAACCTGGCTCCAAKVADKTGTKVAEKTGTKVADKTGTK AGGTCGGCACCAACATCACCCAGAAGAGGGAVAEKTGTNISQKEDEKGPPKEDTQGT CACCAAGATGGCCAAGAAGACCGTGACCAAGQKADAKAIQQADAQVSEKWKKKEWK GTCGCCAAGAAGAAGGTCACGAAGGTCGCCGEWIKKAESDLDIFNALMDNEKEKKWY AGAAGACCGGCACCAAGGTGGCCGACAAGACSEKEKEWNKWIKGVEKKWMHYNKNI 
CGGCACCAAGGTCGCTGATAAGACGGGGACGYVEYRSLVFWVGLKWVESQWEKWIL AAGGTCGCTGATAAGACCGGGACGAAGGTGGSDGLEFLVMDWKKWIKENKSNFDEW CTGAGAAGACGGGGACGAAGGTTGCTGATAALKSEWDTWTNSQMEEWKSSNWKLN GACGGGGACCAAGGTGGCTGAGAAGACCGGEDKRWEMWENDKKWIKWLYLKDWI CACCAACATCAGCCAAAAGGAAGACGAGAAGNCSKWKKRIQKESKEWLRWTKLKEE GGCCCACCAAAGGAAGACACCCAAGGCACCC MYhhhhhhAGAAGGCCGACGCCAAGGCCATCCAACAGGC CGACGCCCAGGTGAGCGAGAAGTGGAAGAAGAAGGAGTGGAAGGAGTGGATCAAGAAGGC CGAGTCCGACCTCGACATCTTCAACGCCCTGATGGACAACGAGAAGGAGAAGAAGTGGTACA GCGAGAAGGAGAAGGAGTGGAACAAGTGGATCAAGGGCGTGGAGAAGAAGTGGATGCACTA CAACAAGAACATCTACGTCGAGTACAGGTCCCTCGTGTTCTGGGTCGGCCTGAAGTGGGTGGA GTCCCAATGGGAGAAGTGGATCCTCAGCGACGGCCTGGAGTTCCTGGTCATGGACTGGAAGA AGTGGATCAAGGAGAACAAGTCCAACTTCGACGAGTGGCTCAAGAGCGAGTGGGACACCTGG ACCAACTCCCAGATGGAGGAGTGGAAGTCCA 17apical membrane PVX_092275 mGEDAEVENAKYRIPAGRCPVFGKGIAAGTACAGGATCCCAGCTGGCAGGTGCCCAG antigen 1, AMA1VIENSDVSFLRPVATGDQKLKDGGFA TGTTCGGCAAGGGCATCGTCATCGAGAACTCC(Orthologs with Pf FPNANDHISPMTLANLKERYKDNVEMGACGTGAGCTTCCTCCGCCCAGTGGCTACCGG vaccine candidates)MKLNDIALCRTHAASFVMAGDQNSS CGACCAAAAGCTGAAGGACGGCGGATTCGCCYRHPAVYDEKEKTCHMLYLSAQENM TTCCCAAACGCCAACGACCACATCTCCCCAATGGPRYCSPDAQNRDAVFCFKPDKNES ACCCTCGCCAACCTGAAGGAGAGGTACAAGGFENLVYLSKNVRNDWDKKCPRKNLG ACAACGTGGAGATGATGAAGCTCAACGACATNAKFGLWVDGNCEEIPYVKEVEAEDL CGCTCTGTGCAGGACCCACGCTGCTAGCTTCGRECNRIVFGASASDQPTQYEEEMTD TGATGGCTGGCGACCAGAACTCCAGCTACAGYQKIQQGFRQNNREMIKSAFLPVGAF GCACCCAGCCGTCTACGACGAGAAGGAGAAGNSDNFKSKGRGFNWANFDSVKKKCY ACCTGCCACATGCTCTACCTGTCCGCCCAAGAIFNTKPTCLINDKNFIATTALSHPQEVD GAACATGGGCCCAAGGTACTGCTCCCCAGACLEFPCSIYKDEIEREIKKQSRNMNLYS GCTCAGAACAGGGACGCTGTCTTCTGCTTCAAVDGERIVLPRIFISNDKESIKCPCEPER GCCAGACAAGAACGAGTCCTTCGAGAACCTCGISNSTCNFYVCNCVEKRAEIKENNQV TGTACCTGAGCAAGAACGTCAGGAACGACTGVIKEEFRDYYENGEEKSNKQhhhhhh GGACAAGAAGTGCCCACGCAAGAACCTCGGCAACGCCAAGTTCGGCCTGTGGGTGGACGGCA ACTGCGAGGAGATCCCATACGTGAAGGAAGTGGAGGCCGAGGACCTCAGGGAGTGCAACAG GATCGTCTTCGGCGCTTCCGCTAGCGACCAACCAACCCAGTACGAGGAAGAGATGACCGACTA 
CCAAAAGATCCAACAGGGCTTCAGGCAGAACAACCGCGAGATGATCAAGTCCGCCTTCCTCCC AGTGGGCGCCTTCAACTCCGACAACTTCAAGAGCAAGGGCCGCGGCTTCAACTGGGCCAACTTC GACAGCGTGAAGAAGAAGTGCTACATCTTCAACACCAAGCCAACCTGCCTGATCAACGACAAGA ACTTCATCGCCACCACCGCCCTCTCCCACCCAC 18hypothetical protein PVX_084720 mNGNRNLNIKPTCHKSGKNDKANGSCAACCTGCCACAAGAGCGGCAAGAACGACAA DNIANKGGAQHAANGATGTPSGSSNGGCCAACGGCTCCGACAACATCGCTAACAAG GKKGATTTSASAGQAGASGGMAAPGGCGGCGCCCAACACGCTGCTAACGGCGCCA GMNPNFEQMMKPLNDMFKGNGEGLCCGGCACCCCAAGCGGCTCCAGCAACGGCAA NIENIMNSDMFQNFFNSLMGGNPHDGAAGGGCGCTACGACCACCAGCGCTTCCGCT GAGGGQEILFKDMLNAMNAQGGGAPGGCCAAGCTGGCGCTTCCGGCGGCATGGCCG GAAATSGGANKDPNISVSPEQLNKINCCCCAGGCATGAACCCAAACTTCGAGCAGATG QLKDKLENVLKNVGVDVEQLKENMQATGAAGCCACTGAACGACATGTTCAAGGGCA NENIMQNKDALRDLLANLPMNPGMMACGGCGAGGGCCTCAACATCGAGAACATCAT QNMMAGKDGNMFNMDPNQMMNMFGAACAGCGACATGTTCCAGAACTTCTTCAACT NQLSQGKMNMKDFGMGDFMPPPVHCCCTGATGGGCGGCAACCCACACGACGGCGC ANDQDAEDDSRGKAFVTNSSNNDINTGGCGGCGGCCAAGAGATCCTGTTCAAGGAC FAHKLNAFEYSNGPSEGMFQLYGMNATGCTCAACGCCATGAACGCCCAAGGCGGCG NDDGVIDDGMSDSVGKNSALDVSGGGCGCCCCAGGCGCTGCCGCCACCTCCGGCGG SINRNLSDGDSAKEDSDESNANATSNCGCCAACAAGGACCCAAACATCAGCGTCTCCC SNATVPNKGGHEGGSANEVYSNEEECAGAGCAGCTGAACAAGATCAACCAACTCAA LITSSGSKGDANKLAGTGGYKNNNAFGGACAAGCTGGAGAACGTGCTCAAGAACGTG LDLNNLKKDASAAKYGKDNSGDKSNGGCGTCGACGTGGAGCAGCTCAAGGAGAACA GGNSNGGNNKVMNKRIGGKKKKTFKTGCAAAACGAGAACATCATGCAGAACAAGGA KKKNPGQIPFKMETLQKLVKEYTNTSCGCTCTGAGGGACCTCCTGGCTAACCTCCCGA NQKIMEKIIKKYVSMSNQSARGNSEETGAACCCAGGCATGATGCAAAACATGATGGCC EDDEEEAEDEKSAKDKNSEKEAELNGGCAAGGACGGCAACATGTTCAACATGGACC MNEFSVKDIKKLISEGILTYEDLTEEELCAAACCAGATGATGAACATGTTCAACCAACTC KKLAKPDDMFYELSPYANEEKDLSLNAGCCAGGGCAAGATGAACATGAAGGACTTCG ETSGVSNEQLNAFLRKNGSYHMSYDGCATGGGCGACTTCATGCCACCACCAGTCCAC SKAIDYLKQKKAEKKEEEQEDDNFYDGCCAACGACCAAGACGCTGAGGACGACTCCC AYKQIKNSYEGIPSNYYHDAPQLIGENGCGGCAAGGCTTTCGTGACCAACTCCAGCAAC YVFTSVYDKKKELIDFLKRSNGATDSAACGACATCAACTTCGCCCACAAGCTGAACGC 19 merozoite surface PVX_003770mPLEVSLWGQGNAHLGTQTSRLLRE GCAACGCTCACCTCGGCACCCAAACCTCCCGC protein 5SGRNGQANRVNQADQADQVASPPIS 
CTGCTCAGGGAGTCCGGCAGGAACGGCCAGGGKERRRGIGMTSNLQLLSGEDEKDS CCAACAGGGTGAACCAGGCTGACCAGGCTGATSEEAPNLEGKDNADAGKDGEKEPS CCAAGTGGCTTCCCCACCAATCTCCGGCAAGGEKQSGDVDPTVTDAERAKDENASVS AGAGGCGCAGGGGCATCGGCATGACCTCCAAEEEQMKTLDSGEDHTDDGNADGGQ CCTCCAACTCCTGAGCGGCGAGGACGAGAAGGGGDGNDENQKGDGKEKEGGEEKK GACTCCACCAGCGAGGAAGCCCCAAACCTGGEDGKDDHEKGEKGSEGESGEKDEA AGGGCAAGGACAACGCTGACGCTGGCAAGGAAPKGDAAEKDKKLESKTADAKVSEH TGGCGAGAAGGAGCCATCCGAGAAGCAGAGCKADDANPGGNKDSPEGESPKEGNPD GGCGACGTGGACCCAACCGTCACCGACGCTGDPSQKNPEAAGDDDSRLHLDNLDDK AGAGGGCTAAGGACGAGAACGCTTCCGTCAGVPHYSALRNNRVEKGVTDTMVLNDII CGAGGAAGAGCAGATGAAGACCCTGGACAGCGENAKSCSVDNGGCADDQICIRIDNI GGCGAGGACCACACCGACGACGGCAACGCTGGIKCICKEGHLFGDKCILTKhhhhhh ACGGCGGACAAGGCGGCGGCGACGGCAACGACGAGAACCAAAAGGGCGACGGCAAGGAGA AGGAAGGCGGCGAGGAGAAGAAGGAAGACGGCAAGGACGACCACGAGAAGGGCGAGAAGG GCTCCGAGGGCGAGAGCGGCGAGAAGGACGAGGCTGCTCCAAAGGGCGACGCTGCCGAGAA GGACAAGAAGCTGGAGTCCAAGACCGCCGACGCCAAGGTGAGCGAGCACAAGGCTGACGACG CTAACCCAGGCGGCAACAAGGACTCCCCAGAGGGCGAGAGCCCAAAGGAAGGCAACCCAGAC GACCCATCCCAGAAGAACCCGGAGGCTGCTGGCGACGACGACAGCCGCCTCCACCTGGACAAC CTCGACGACAAGGTCCCACACTACTCCGCCCTGCGCAACAACAGGGTGGAGAAGGGCGTCACC GACACCATGGTGCTGAACGACATCATCGGCG 20TRAg (Pv-fam-a) PVX_092990 mDVLQLVIPSEEDIQLDKPKKDELGSGGAAGACATCCAGCTCGACAAGCCAAAGAAG GILSILDVHYQDVPKEFMEEEEETAVYGACGAGCTGGGCAGCGGCATCCTCTCCATCCT PLKPEDFAKEDSQSTEWLTFIQGLEGGGACGTIGCACTACCAAGACGTCCCAAAGGAG DWERLEVSLNKARERWMEQRNKEWTTCATGGAGGAAGAGGAAGAGACGGCCGTGT AGWLRLIENKWSEYSQISTKGKDPAACCCACTCAAGCCAGAGGACTTCGCCAAGGAA GLRKREWSDEKWKKWFKAEVKSQIGACTCCCAAAGCACCGAGTGGCTCACCTTCAT DSHLKKWMNDTHSNLFKILVKDMSQCCAAGGCCTGGAGGGCGACTGGGAGAGGCT FENKKTKEWLMNHWKKNERGYGSEGGAGGTGTCCCTGAACAAGGCCAGGGAGCGC SFEVMTTSKLLNVAKSREWYRANPNITGGATGGAGCAAAGGAACAAGGAGTGGGCT NRERRELMKWFLLKENEYLGQEWKKGGCTGGCTCAGGCTGATCGAGAACAAGTGGT WTHWKKVKFFVFNSMCTTFSGKRLTCCGAGTACAGCCAGATCTCCACCAAGGGCAA KEEWNQFVNEIKVhhhhhhGGACCCGGCTGGCCTCAGGAAGCGCGAGTGG TCCGACGAAAAGTGGAAGAAGTGGTTCAAGGCCGAGGTGAAGAGCCAAATCGACTCCCACCTG AAGAAGTGGATGAACGACACCCACAGCAACCTCTTCAAGATCCTGGTCAAGGACATGTCCCAG 
TTCGAGAACAAGAAGACCAAGGAGTGGCTCATGAACCACTGGAAGAAGAACGAGAGGGGCTA CGGCTCCGAGAGCTTCGAGGTCATGACCACCAGCAAGCTCCTGAACGTCGCCAAGTCCAGGGA GTGGTACTGCGCCAACCCAAACATCAACCGCGAGAGGCGCGAGCTCATGAAGTGGTTCCTCCTG AAGGAGAACGAGTACCTGGGCCAAGAGTGGAAGAAGTGGACCCACTGGAAGAAGGTGAAGTT CTTCGTCTTCAACAGCATGTGCACCACCTTCTCCGGCAAGCGCCTGACCAAGGAAGAGTGGAAC CAGTTCGTGAACGAGATCAAGGTCCACCACCACCACCACCACTGA 21 unspecified product PVX_112690mEAMPKFPQNNLKGGLKDSPLKQPK ATGGAGGCCATGCCAAAGTTCCCACAAAACAASPLINGPPKPVNDKLKDDSNKTETKD CCTCAAGGGCGGCCTGAAGGACTCCCCACTCAAKNGLNKPPKNINDKVKDGENKTPS AGCAGCCAAAGAGCCCACTGATCAACGGCCCQDLNEPSFKLPMRQKESSWYTWLK ACCAAAGCCAGTGAACGACAAGCTCAAGGACGTKKDYETLKCFAKGNLYDWLCNVR GACTCCAACAAGACCGAGACGAAGGACGCCAESFDLYLQSLEKKWTTCSDSATTLFL AGAACGGCCTGAACAAGCCACCAAAGAACATCECFAESSGWNDSQWGNWMNNQL CAACGACAAGGTCAAGGACGGCGAGAACAAGKEQLKTEAEAWISTKKKDFDGLTSKY ACCCCATCCCAAGACCTCAACGAGCCAAGCTTFSLWKDHRRKELDADEWKNKVSSG CAAGCTGCCAATGAGGCAGAAGGAGTCCAGCGLSEWEELTNKMNTRYRNNLDNMW TGGTACACCTGGCTCAAGGGCACCAAGAAGGSHFSRDLFFNFDEWAPQVLEKWIEN ACTACGAGACGCTGAAGTGCTTCGCCAAGGGKQWNRWVKKVRKhhhhhh CAACCTCTACGACTGGCTGTGCAACGTGCGCGAGTCCTTCGACCTCTACCTGCAAAGCCTGGAG AAGAAGTGGACCACCTGCTCCGACAGCGCTACCACCCTCTTCCTGTGCGAGTGCTTCGCCGAGT CCAGCGGCTGGAACGACTCCCAGTGGGGCAACTGGATGAACAACCAACTCAAGGAGCAGCTG AAGACCGAGGCCGAGGCCTGGATCAGCACCAAGAAGAAGGACTTCGACGGCCTCACCTCCAAG TACTTCAGCCTGTGGAAGGACCACAGGCGCAAGGAGCTCGACGCCGACGAGTGGAAGAACAA GGTGTCCAGCGGCGGCCTCAGCGAGTGGGAGGAGCTGACCAACAAGATGAACACCAGGTACC GCAACAACCTCGACAACATGTGGTCCCACTTCAGCAGGGACCTGTTCTTCAACTTCGACGAGTG GGCCCCACAAGTCCTGGAGAAGTGGATCGAGAACAAGCAGTGGAACCGCTGGGTGAAGAAGG TCCGCAAGCACCACCACCACCACCACTGA 22petidase, M16 family PVX_091710 mQRAPNNGRNNYGLNDDELGAILFGACTACGGCCTCAACGACGACGAGCTGGGCGC LNYDSIAKNKDNLEKRKNVENESIFLRCATCCTCTTCGGCCTGAACTACGACAGCATCG NFANEDTSKNTQSEKAQKEIKIETETECCAAGAACAAGGACAACCTGGAGAAGAGGAA SVNSNEKEVATSQKSDTSNKNSSVEGAACGTCGAGAACGAGTCCATCTTCCTGCGCA NEKIELKNDELLGKNFEKDKVNKKGDACTTCGCCAACGAGGACACCAGCAAGAACACC NTNTTNNHDLTNSSEKQGVDIRGSKCAATCCGAGAAGGCCCAGAAGGAGATCAAGA 
NMNNYLQKTGDTNIEKSESLQKDVNITCGAGACGGAGACGGAGTCCGTCAACAGCAA KNHNEEANDAKRLDSAQTNNEKSKISCGAGAAGGAAGTGGCCACCTCCCAGAAGAGC KDTIDKDVQSNELTNLASNRSNKKSQGACACCTCCAACAAGAACTCCAGCGTCGAGAA GLAKKENELKSANLEENHNAKKDLLKCGAGAAGATCGAGCTGAAGAACGACGAGCTC KDQKREDGKKITHPENSNSDQYGVQCTGGGCAAGAACTTCGAGAAGGACAAGGTGA VSLNDEEKNTNTKSVSHSEDHSASYACAAGAAGGGCGACAACACCAACACCACCAA SGEKFGTHVSNSQKDMLKNIRPVQFCAACCACGACCTCACCAACTCCAGCGAGAAGC DESAYGKLNGGSPENDENEILNKINKAAGGCGTCGACATCAGGGGCAGCAAGAACAT NNENNFSEKVALRKGTKDRNEYEYFGAACAACTACCTCCAAAAGACCGGCGACACCA KLKSNDFKVLGIINKYSSRGGFSISVDACATCGAGAAGTCCGAGAGCCTGCAGAAGGA CGGYDDFDEVPGVSNLLQHAIFYKSECGTGAACATCAAGAACCACAACGAGGAAGCC KRNTTLLSELGKYSSEYNSCTSESSTAACGACGCCAAGAGGCTGGACAGCGCCCAGA SYYATAHSEDIYHLLNLFAENLFYPVFCCAACAACGAGAAGAGCAAGATCTCCAAGGA SEEHIQNEVKEINNKYISIENNLESCLKCACCATCGACAAGGACGTGCAATCCAACGAGC IASQYITNFKYSKFFVNGNYTTLCENVTCACCAACCTGGCCAGCAACCGCTCCAACAAG LKNRLSIKNILTEFHKKCYQPRNMSLTAAGAGCCAGGGCCTCGCCAAGAAGGAGAACG ILLGNKVNTADHYNMKDVENMVVHIFAGCTCAAGTCCGCCAACCTGGAGGAGAACCA GKIKNESYPIDGDVIGKRINRMESERVCAACGCCAAGAAGGACCTCCTGAAGAAGGAC NLYGKKDSYNDANFIHIEGRNEKEAACAAAAGAGGGAGGACGGCAAGAAGATCACCC FLQSMNELHYALDLNQKSRYVEIIKKEACCCAGAGAACTCCAACAGCGACCAATACGGC EWGDQLYLYWSSKTNAELCKKIEEFGTGCAAGTGTCCCTGAACGACGAGGAGAAGA GSMTFLREIFSDFRRNGLYYKISVENKACACCAACACCAAGTCCGTCAGCCACTCCGAG 23 rhoptry-associated PVX_087885mKEAVKKGSKKAMKQPMHKPNLLEE ATGAAGGAAGCCGTGAAGAAGGGCTCCAAGAmembrane antigen, EDFEEKESFSDDEMNGFMEESMDASAGGCCATGAAGCAACCAATGCACAAGCCAAA RAMA KLDAKKAKTTLRSSEKKKTPTSGMSGCCTCCTGGAGGAAGAGGACTTCGAGGAGAAG MSGSGATSAATEAATNMNATAMNAAGAGTCCTTCAGCGACGACGAGATGAACGGCT AKGNSEASKKQTDLSNEDLFNDELTETCATGGAGGAGTCCATGGACGCCAGCAAGCT EVIADSYEEGGNVGSEEAESLTNAFDGGACGCCAAGAAGGCCAAGACCACCCTCAGG DKLLDQGVNENTLLNDNMIYNVNMVPTCCAGCGAGAAGAAGAAGACCCCAACCTCCG HKKRELYISPHKHTSAASSKNGKHHAGCATGAGCGGCATGTCCGGCAGCGGCGCTAC ADADALDKKLRAHELLELENGEGSNSCAGCGCTGCTACCGAGGCCGCCACCAACATGA VIVETEEVDVDLNGGKSSGSVSFLSSACGCTACCGCCATGAACGCTGCCGCCAAGGG VVFLLIGLLCFTNhhhhhhCAACTCCGAGGCTAGCAAGAAGCAAACCGAC 
CTCTCCAACGAGGACCTGTTCAACGACGAGCTCACCGAGGAAGTGATCGCCGACAGCTACGAG GAAGGCGGCAACGTGGGCTCCGAGGAAGCCGAGAGCCTGACCAACGCCTTCGACGACAAGCT CCTGGACCAGGGCGTGAACGAGAACACCCTCCTGAACGACAACATGATCTACAACGTGAACAT GGTCCCACACAAGAAGAGGGAGCTCTACATCTCCCCACACAAGCACACCAGCGCCGCCTCCAGC AAGAACGGCAAGCACCACGCTGCTGACGCTGACGCTCTGGACAAGAAGCTCAGGGCTCACGA GCTCCTGGAGCTGGAGAACGGCGAGGGCTCCAACAGCGTGATCGTCGAGACGGAGGAAGTGG ACGTGGACCTGAACGGCGGCAAGTCCTCCGGCTCCGTCAGCTTCCTCTCCAGCGTGGTCTTCCT CCTGATCGGCCTCCTGTGCTTCACCAACCACCACCACCACCACCACTGA 24 HP, conserved PVX_003555 mDDNGRRLPRKAAPPVDKAKQDVMAGGCTGCCCCACCAGTGGACAAGGCCAAGCA KDIVNYLSKNMLAFVRQKRNVSGKEGGGACGTGATGAAGGACATCGTCAACTACCTCT EAPTGPSGAQGGDSSQYASKFTFTDCCAAGAACATGCTGGCCTTCGTGAGGCAAAA HSVDFSKYNKLDKEKFAAKDDLKSRLGCGCAACGTCTCCGGCAAGGAAGGCGAGGCT KNEVVASMLDTEGDILTEEFGYLLRNCCAACCGGCCCAAGCGGCGCTCAAGGCGGCG YFDKVKLEEKKSQEAESAKPAEQEEEACTCCAGCCAGTACGCCAGCAAGTTCACCTTC AEEAPEQKEEATAEKATEETTEAATEACCGACCACTCCGTGGACTTCAGCAAGTACAA ETTEAATEETTEAATEETTEAATEETTCAAGCTCGACAAGGAGAAGTTCGCCGCCAAG EAATEETTEAATEETTEAATEETTEAGACGACCTCAAGTCCAGGCTGAAGAACGAGG ATEEATEGATEEGAEETTEEATEEGATGGTCGCCAGCATGCTCGACACCGAGGGCGA EEATEEGAEEATEEGAEETTEEATEECATCCTGACCGAGGAGTTCGGCTACCTCCTGC GAEETTEETTEEGAEEEATEEGAEETGCAACTACTTCGACAAGGTCAAGCTGGAGGA TEEGAEEAAEEGAEEGAEAATEEATGAAGAAGTCCCAAGAGGCCGAGAGCGCTAAG EEATEEATEEATEEATEEATEEATAECCAGCTGAGCAAGAGGAAGAGGCCGAGGAA VAEAATPEKVTEEATEEATEEGDNEPGCCCCAGAGCAAAAGGAAGAGGCCACCGCTG AEQAAEKEEDVKGGLMDNETYYNTLAGAAGGCTACCGAGGAGACGACCGAGGCTGC QELYEEIENDDKKEKEKIQKAKEQEECACGGAGGAGACGACGGAGGCCGCCACGGA LEKKLFKESKKGKKKEKKRRKKLCKMGGAGACGACCGAGGCCGCCACCGAGGAGAC AKIVEKYAEEIPKDSERSLRYDKEEHIGACGGAGGCTGCCACTGAAGAGACGACCGAG DDPDEMDDLLFGEFKTLEKYGTHKTSGCTGCGACGGAAGAGACGACCGAGGCCGCGA TFYYEMTCFDERLRDFEINTKLKEMECGGAAGAGACGACTGAGGCTGCCACTGAGGA EVPEKWELLSLYWQSYRNERHKYLAGACGACGGAAGCTGCTACCGAGGAAGCCACC VKKYLLEKFLELKTNQSTEALPKYNKGAGGGCGCTACCGAGGAAGGCGCTGAGGAG KWKQCEEIVDNNFTKQHEHVNDVFYACGACGGAGGAAGCCACGGAGGAAGGCGCT TFVAKENLSRDEFKEILNDVRASWhhGAGGAAGCCACCGAGGAAGGCGCCGAGGAA hhhh 
GCCACGGAGGAAGGCGCAGAGGAGACGACAGAGGAAGCCACGGAGGAAGGCGCCGAAGAG ACGACCGAAGAGACGACCGAGGAAGGCGCG 25phosphatidylinositol-4- PVX_117385 MRCCTKDAVNVESPKKVVVGETEEDTGGAGTCCCCAAAGAAGGTGGTCGTGGGCGA phosphate-5-kinase,TREEENPYEDLPTVTVTLSDGSVYTG GACGGAGGAAGACACCAGGGAGGAAGAGAA putativeTTKDNRVHGRGVLKYVNGDQYEGEF CCCATACGAGGACCTCCCAACCGTCACCGTGAVDGKKEGKGKWTDKENNTYEGDWV CCCTGTCCGACGGCAGCGTCTACACCGGCACCKDKRHGHGVYKTAEGFIFEGEFANNK ACCAAGGACAACAGGGTGCACGGCCGCGGCGREGKGTIITPEKTKYVCSFQDDEEVG TCCTCAAGTATGTGAACGGCGACCAATACGAGEVEFFFANGDHALGYIKDGYLCQNGR GGCGAGTTCGTCGACGGCAAGAAGGAAGGCAYEFKNGDIYVGNFEKGLFHGEGYYK AGGGCAAGTGGACCGACAAGGAGAACAACACWNNDANYTIYEGNYSEGKKHGKGQL CTACGAGGGCGACTGGGTCAAGGACAAGAGGINKDGRILCGMFRDNNMDGEFLEISP CACGGCCACGGCGTGTACAAGACCGCTGAGGQGNQTKVLYDKGFFVKVLDKIEENLD GCTTCATCTTCGAGGGCGAGTTCGCCAACAACVQEFLKDSIIHTTIFSDPTTYKKLYEITE AAGCGCGAGGGCAAGGGCACCATCATCACCCKKKPQFRLNLKRTQPTShhhhhh CAGAGAAGACCAAGTATGTGTGCAGCTTCCAAGACGACGAGGAAGTGGGCGAGGTGGAGTTCT TCTTCGCCAACGGCGACCACGCCCTCGGCTACATCAAGGACGGCTACCTGTGCCAGAACGGCC GCTACGAGTTCAAGAACGGCGACATCTACGTGGGCAACTTCGAGAAGGGCCTGTTCCACGGCG AGGGCTACTACAAGTGGAACAACGACGCCAACTACACCATCTACGAGGGCAACTACTCCGAGG GCAAGAAGCACGGCAAGGGCCAACTCATCAACAAGGACGGCAGGATCCTGTGCGGCATGTTC CGCGACAACAACATGGACGGCGAGTTCCTGGAGATCAGCCCACAAGGCAACCAGACCAAGGT CCTCTACGACAAGGGCTTCTTCGTCAAGGTGCTGGACAAGATCGAGGAGAACCTCGACGTGCA GGAGTTCCTGAAGGACTCCATCATCCACACCACCATCTTCAGCGACCCAACCACCTACAAGAAG 26 Plasmodium exported PVX_113225mNKLGTSLVEDATANGEFGLRVQRL ACGCTACCGCTAACGGCGAGTTCGGCCTCGCprotein, unknown LGGSRSSRDSIFADSFYDDDDDDDDGTCCAAAGGCTGCTGGGCGGCTCCAGGTCCA function NNDKLFDYDSDHKSRREVKDRHHRHGCCGCGACAGCATCTTCGCCGACTCCTTCTAC RHSHSHRHKRRHSHKHRTSSRSRREGATGATGACGACGACGACGACGACAACAACG KEESSTTNDDDDEVLSLSRFDVDDDKACAAGCTGTTCGACTACGACAGCGACCACAAG DDRSHSRYSVDYDDENDDEPSSSRPTCCAGGCGCGAGGTGAAGGACAGGCACCACA ASTDYDDIIDLTNARRSGSKYRISSMDGGCACAGGCACAGCCACTCCCACCGCCACAAG IELYPEHEDEYLFEGKRRSGGVLKKAAGGCGCCACAGCCACAAGCACAGGACCTCCA DNYCENKIFDALSALDKYKEYYGEERGCCGCTCCAGGCGCGAGAAGGAAGAGTCCAG 
RVMKQAAYRKATKVFAIPGAAALSPLICACCACCAACGACGACGACGACGAGGTGCTC ITLFLTTSNVVALPLAASAVILGGILYKAGCCTGTCCAGGTTCGACGTCGACGACGACAA KSKDKSDYGRPHLKSITYhhhhhhGGACGACAGGAGCCACTCCCGCTACAGCGTG GACTACGACGACGAGAACGACGACGAGCCATCCAGCTCCAGGCCAGCCTCCACCGACTACGAC GACATCATCGACCTCACCAACGCTAGGCGCAGCGGCTCCAAGTACCGCATCAGCTCCATGGACA TCGAGCTCTACCCAGAGCACGAGGACGAGTACCTGTTCGAGGGCAAGAGGCGCAGCGGCGGC GTCCTGAAGAAGGCTGACAACTACTGCGAGAACAAGATCTTCGACGCCCTCTCCGCCCTGGAC AAGTACAAGGAGTACTACGGCGAGGAGAGGCGCGTGATGAAGCAGGCCGCCTACAGGAAGGC CACCAAGGTCTTCGCTATCCCAGGCGCTGCCGCCCTCAGCCCACTGATCATCACCCTCTTCCTGA CCACCAGCAACGTGGTGGCTCTCCCACTGGCTGCTTCCGCCGTCATCCTCGGCGGCATCCTGTA CAAGAAGAGCAAGGACAAGTCCGACTACGGCCGCCCACACCTCAAGTCCATCACCTACCACCAC 27 tryptophan-rich antigen PVX_090265MEAARGVSGLVPSSNSLQEITLRYKD TCCCATCCAGCAACAGCCTCCAAGAGATCACC (Pv-fam-a)KLLNMDKEQMILTLGVTMIAITSAVAF CTGCGCTACAAGGACAAGCTCCTGAACATGGAGVLATHGDINDFLGVESDEESEKKKE CAAGGAGCAGATGATCCTCACCCTGGGCGTCAIVEKSEEWKRKEWSNWLKKLEQDW CCATGATCGCTATCACCTCCGCTGTGGCTTTCGKVFNEKLQNEKKTFLEEKEEDWNTWI GCGTCCTGGCTACCCACGGCGACATCAACGACKSVEKKWTHFNPNMDKEFHTNMMR TTCCTGGGCGTCGAGTCCGACGAGGAGAGCGRSINWTESQWREWIQTEGRLYLDIE AGAAGAAGAAGGAGATCGTGGAGAAGTCCGWKKWFFENQSRLDELIVKKWIQWKK AGGAGTGGAAGAGGAAGGAGTGGAGCAACTDKIINWLMSDWKRAEQEHWEEFEEK GGCTCAAGAAGCTGGAGCAAGACTGGAAGGTSWSSKFFQIFEKRNYEDFKDRVSDE CTTCAACGAGAAGCTCCAGAACGAGAAGAAGWEDWFEWVKRKDNIFITNVLDQWIK ACCTTCCTGGAGGAGAAGGAAGAGGACTGGAWKEEKNLLYNNWADAFVTNWINKKQ ACACCTGGATCAAGTCCGTGGAGAAGAAGTGWVVWVNERRNLAAKAKAALNKKKhh GACCCACTTCAACCCAAACATGGACAAGGAGT hhhhTCCACACCAACATGATGAGGCGCTCCATCAAC TGGACCGAGAGCCAATGGCGCGAGTGGATCCAGACCGAGGGCAGGCTCTACCTGGACATCGA GTGGAAGAAGTGGTICTICGAGAACCAAAGCAGGCTCGACGAGCTGATCGTGAAGAAGTGGA TCCAGTGGAAGAAGGACAAGATCATCAACTGGCTCATGTCCGACTGGAAGCGCGCCGAGCAA GAGCACTGGGAGGAGTTCGAGGAGAAGAGCTGGTCCAGCAAGTTCTTCCAGATCTTCGAGAA GCGCAACTACGAGGACTTCAAGGACCGCGTGAGCGACGAGTGGGAGGACTGGTTCGAGTGG GTCAAGCGCAAGGACAACATCTTCATCACCAACGTGCTGGACCAGTGGATCAAGTGGAAGGAA GAGAAGAACCTCCTGTACAACAACTGGGCCGACGCCTTCGTCACCAACTGGATCAACAAGAAG 28 MSP7 famiiy 
PVX_082700mTKGPSGPPPNKKLNANALHFLRGK CAAGAAGCTCAACGCCAACGCCCTCCACTTCCLELLNKISEEQVVSPDFKKNVELLKKK TGAGGGGCAAGCTGGAGCTCCTGAACAAGATIEELQGKAEKDKSKTDGEDTTPKEQQ CTCCGAGGAGCAAGTGGTCAGCCCAGACTTCAEDQNVSQNGLEEQAPSDSNEGEAQ AGAAGAACGTCGAGCTCCTCAAGAAGAAGATEENTQVKNVIFTEKEEAVDEEAEKED CGAGGAGCTCCAGGGCAAGGCCGAGAAGGATAVISEKANFPNEESQGNDETQTQES CAAGTCCAAGACCGACGGCGAGGACACCACCIEGEASPGVVVDETDDSPEGEPLSGL CCAAAGGAGCAACAAGAGGACCAAAACGTGAETEGNSSAESAPNEPDVNTTHTAVD GCCAGAACGGCCTGGAGGAGCAAGCTCCGTCTHMPADANIGVDTNMPFDTPPHPSG CGACAGCAACGAGGGCGAGGCTCAAGAGGAENPGAPQETHLPSIDENANRRASRM GAACACCCAGGTCAAGAACGTGATCTTCACCGKHMSSFLNGLLTNQSNNKKEIFFHPY AGAAGGAAGAGGCCGTCGACGAGGAAGCCGYGPYFNHGGYYNYDPYYNYAPAYNP AGAAGGAAGACACCGCCGTGATCTCCGAGAAFVSQARDYEVIKKLLDACFNKGEGAD GGCCAACTTCCCAAACGAGGAGAGCCAGGGCPNVPCIIDIFKKVLDDERFRNELKTFM AACGACGAGACGCAAACCCAAGAGTCCATCGYDLYEFLKKNDVLSDDEKKNELMRFF AGGGCGAGGCTAGCCCGGGCGTGGTGGTGGFDNAFQLVNPMFYYhhhhhh ACGAGACGGACGACTCCCCGGAGGGCGAGCCACTCAGCGGCCTCGAAACCGAGGGCAACTCCA GCGCTGAGTCCGCTCCAAACGAGCCAGACGTCAACACCACCCACACCGCTGTGGACACCCACAT GCCAGCTGACGCCAACATCGGCGTCGACACCAACATGCCATTCGACACCCCACCACACCCAAGC GGCGAGAACCCGGGCGCCCCACAAGAGACGCACCTCCCATCCATCGACGAGAACGCCAACAGG CGCGCCAGCAGGATGAAGCACATGTCCAGCTTCCTGAACGGCCTCCTGACCAACCAGTCCAACA ACAAGAAGGAGATCUCTTCCACCCATACTACGGCCCATACTTCAACCACGGCGGATACTACAA CTACGACCCATACTACAACTACGCCCCAGCCTA 29Hyp, huge list of PVX_002550 mFSGGVGDDEEEEEEEEGEEGESEAAGAGGAAGAGGAAGAGGAAGGCGAGGAAG orthologs, paralogs,RDDSERDYAGRDDAGRDDAERNDA GCGAGAGCGAGAGGGACGACTCCGAGAGGGsynteny with Py LSA3 ERDDAERNDAERDDAERDHAERDHAACTACGCTGGCAGGGACGATGCCGGCAGGGA (PyLSA3syn-3) DKAESDRESSLEANENRLVKLSEGGCGACGCCGAGAGGAACGACGCCGAGCGCGAT ESEPALLEVEEDIKQTVLGMFSLKGEGATGCTGAGCGCAACGACGCCGAGCGCGACG FDEAESEKLALDLQKNLLSMLSGNMEACGCCGAGAGGGACCACGCCGAGCGCGACCA DNDDEYEDIDEEYEEVEEDYEEEKLGCGCCGACAAGGCCGAGTCCGACAGGGAGTCC KPVEVVVEDATEEAVDEVVGVVQEPAGCCTGGAGGCCAACGAGAACAGGCTGGTGA EEEGAEESDKDTGEVSEEEVAKEAAAGCTCAGCGAGGGCGGCGAGTCCGAGCCAGC DEVMEEEKKEEAGEPSVVVEEPSVVTCTCCTGGAGGTGGAGGAAGACATCAAGCAA 
VKEPSVVVKEPSVVVEEPSVVVEEPSACCGTCCTGGGCATGTTCAGCCTCAAGGGCGA VVVEEPSVVVEEPAFTVEEPAFTVEEGTTCGACGAGGCCGAGTCCGAGAAGCTCGCC PAITVEEPAITVEEPVFTVEEPVFTVECTGGACCTCCAGAAGAACCTCCTGTCCATGCT EPAFTVEEPAFTVEEPAFTVEEPATTCAGCGGCAACATGGAGGACAACGACGACGAG VEELVEEVLKVAEEEVATEAVEKDGETACGAGGACATCGACGAGGAGTACGAGGAAG EAEEQVTEESVEEDEEESGEEEGEETGGAGGAAGACTACGAGGAAGAGAAGCTCG SEEEETEESAEEEVAKESVEEEVAKEGCAAGCCAGTGGAGGTGGTCGTGGAGGACGC AEESEESGEESAEEEKEKAEEPVAPVCACCGAGGAAGCCGTGGACGAGGTGGTGGG DEVLKEGMQKIEESVKEALGVVQEAVCGTCGTGCAAGAGCCAGAGGAAGAGGGCGCT DKVAEEEQTEQAQGPAEAGPVGVVKGAGGAGAGCGACAAGGACACCGGCGAGGTG EPEEEEESEEEGEEGEEGEEGEEEETCCGAGGAAGAGGTGGCCAAGGAAGCCGCCG EEESEEEESEEGESEAGESEAGKSDACGAGGTCATGGAGGAAGAGAAGAAGGAAG AAESEVAESEAGEPAEDQAGMDAKMAGGCCGGCGAGCCATCCGTGGTGGTGGAGGA KDELLGMLSEKMKAEGKDLDKLPPEGCCAAGCGTGGTCGTGAAGGAGCCATCCGTC VKKNLLDMLAGNMEMDDEEEEGEEEGTGGTCAAGGAGCCTTCCGTGGTCGTGGAGG GEDLGNEELDLQKNLLEMLSGKGGFAGCCTAGCGTCGTCGTCGAGGAGCCTTCCGTC NPNMLGNLKELEALQKSVPGLMGKAGTGGTGGAGGAGCCCAGCGTGGTCGTCGAGG QGISPAEIESLKSMFSGAFDSRGFKGAGCCAGCCTTCACCGTGGAGGAGCCTGCCTTC 30 MSP7-like protein PVX_082650mQLGIQKKKKNLEQDAMHALMKKLE ACCTGGAGCAGGACGCCATGCACGCCCTCATGSLYKLSATDNGEIFNKEIDALKKQIDQ AAGAAGCTGGAGAGCCTGTACAAGCTCTCCGLHQHGGGNEGESLGHLLESEAADDS CCACCGACAACGGCGAGATCTTCAACAAGGAGKKTIFGVDEDDLDNYDADFIGQSKG GATCGACGCCCFGAAGAAGCAAATCGACCAGKIKGQADTTAVAKPPTGSGAGAHGS CTCCACCAACACGGCGGCGGAAACGAGGGCGHSPPKPSVLVVPGKSGKEDSVATLEN AGAGCCTGGGCCACCTCCTGGAGAGCGAGGCGYESIHGEDEPREDSTSHDSPPALPV TGCTGACGACTCCGGCAAGAAGACCATCTTCGGRSEGDSSASGGGTEGQQPDPASA GCGTGGACGAGGACGACCTGGACAACTACGARGSQASGGRGGGDQTNTTQPAGGQ CGCCGACTTCATCGGCCAGTCCAAGGGCAAGQSSSAARSLQAPHAGDSQLPNAGGD ATCAAGGGCCAGGCTGACACCACCGCTGTGGPQSPAAAGHQQPPTSPPANNEGTTV CTAAGCCACCAACCGGCAGCGGCGCTGGCGCTQESALAATPPKGTADSNDAKIKYLD TCACGGCAGCCACICCCCACCAAAGCCATCCGKLYDEVLTTSDNTSGIHVPDYHSKYN TGCTCGTGGTCCCAGGCAAGAGCGGCAAGGATIRQKYEYSMNPVEYEIVKNLFNVGF AGACTCCGTCGCCACCCTGGAGAACGGCTACGKNDGAASSDATPLVDVFKKALADEKF AGAGCATCCACGGCGAGGACGAGCCAAGGGAQAEFDNFVHGLYGFAKRHSYLSEAR 
GGACAGCACCTCCCAGGACTCCCCACCAGCTCMKDNKLYSDLLKNAISLMSTLQVShhh TCCCAGTGGGCCGCAGCGAGGGCGACTCCAG hhhCGCTTCCGGCGGCGGCACCGAGGGCCAACAG CCAGACCCAGCTAGCGCCAGGGGCAGCCAGGCTTCCGGCGGCAGGGGCGGCGGCGACCAAAC CAACACCACCCAACCAGCTGGCGGCCAACAGTCCAGCTCCGCTGCTAGGAGCCTGCAGGCCCCA CACGCTGGCGACAGCCAGCTCCCAAACGCCGGCGGCGACCCACAATCCCCAGCTGCCGCCGGC CACCAACAGCCACCAACCTCCCCACCAGCCAACAACGAGGGCACCACCGTGACCCAAGAGTCC GCTCTGGCTGCTACCCCACCAAAGGGCACCGCCGACTCCAACGACGCCAAGATCAAGTACCTGG 31 reticulocyte binding PVX_094255mAAYNTVLQIYKYSDDIVRKQEKCEQ CAAGTACTCCGACGACATCGTGAGGAAGCAAprotein 2b (RBP2b) LVKDGKDICLKFKSINEIKVMIQNSKGGAGAAGTGCGAGCAGCTGGITAAGGACGGCA KESTLSAKVSHSFNKLSELNKIKCNDAGGACATCTGCCTCAAGITCAAGTCCATCAAC ESYDAILETPSREELNKLRSTFKQEKGAGATCAAGGTCATGATCCAGAACAGCAAGG DTIANQAKLSGYKTDFETHIGKLNDLAGCAAGGAGTCCACCCTCAGCGCCAAGGTGTCC KIVDNLKASETLPKNIEEKKTSINLISTCACAGCTTCAACAAGCTCAGCGAGCTGAACAA KLETIEKEIESINSSFDQLLEKGKKCEGATCAAGTGCAACGACGAGAGCTACGACGCC MTKYKLVRDSLSTKINDHSAIIKDNQKATCCTCGAAACCCCATCCAGGGAGGAGCTCAA KATEYLTYIQNNHISIFKDIDMLNENLGCAAGCTGCGCAGCACCTTCAAGCAAGAGAAG EKSVSRYAIAKIEEANDLSAQLTAAVSGACACCATCGCCAACCAGGCCAAGCTCTCCGG EYEAIANSIRKEFTNISDHTEMDTLENCTACAAGACCGACTTCGAGACGCACATCGGCA EAKMLKEHYDNLINKKNIITELHNKINLIAGCTCAACGACCTGGCCAAGATCGTGGACAAC KLLEIRATSDKYVDIAELLGEVVKDQKCTCAAGGCCAGCGAGACGCTGCCAAAGAACA KKLQEAKNKLDTLKDHAVKEKELINHTCGAGGAGAAGAAGACCTCCATCAACCTCATC DSSFTLVSIKAFDEIYDDIKYNVGQLHAGCACCAAGCTCGAAACCATCGAGAAGGAGA TLEVTNFDELKKGKTYEENVTHLLNRTCGAGTCCATCAACTCCAGCTTCGACCAACTCC RETLQNDLHNYEEKDKLKNTNIEMSNTGGAGAAGGGCAAGAAGTGCGAGATGACCAA EENNQIRQTSEVIKKLESEFQNLLKIIQGTACAAGCTCGTCAGGGACTCCCTGAGCACCA QSNTLCSNDNIKQFISDILKKVETIRERAGATCAACGACCACTCCGCCATCATCAAGGAC FVKNFPEREKYHQIEINYNEIKGIVKEVAACCAAAAGAAGGCCACCGAGTACCTCACCTA DTNPEISIFTEKINTYIRQKIRSAHHLECATCCAGAACAACCACATCAGCATCTTCAAGG DAQKIKDIIEDVTSNYRKIKSKLSQVNACATCGACATGCTCAACGAGAACCTGGGCGA NALDRIKIKKSEMDTLFESLSKENANNGAAGTCCGTGAGCAGGTACGCCATCGCCAAG YNSAKYFLVDSDKIIKHLEDQVSKMSSATCGAGGAAGCCAACGACCTCTCCGCTCAACT LISYAEREIKELEEKVYShhhhhhCACCGCTGCCGTCAGCGAGTACGAGGCTATCG 
CCAACTCCATCCGCAAGGAGTTCACCAACATCTCCGACCACACCGAGATGGACACCCTGGAGA ACGAGGCCAAGATGCTAAGGAGCACTACGA 32 MSP3.3 [merozoite PVX_097680 MNVA
RGE
VNLKNPNLRNGWSMKN ACCTGAAGAACCCAAACCTCCGCAACGGCTGG surface protein 3 betaLSAQNEENIVHSDGSDDVTDKEEDG AGCATGAAGAACCTGTCCGCCCAAAACGAGG MSP3b)]EVLEGQKGSPKKSAEQKVHAQEEVN AGAACATCGTCCACTCCGACGGCAGCGACGACKESLKSKAQNAKAEAEKAAKAAESAK GTGACCGACAAGGAAGAGGACGGCGAGGTGENTLDALEKVNVPTELNNEKNFAESA CTGGAGGGCCAGAAGGGCAGCCCAAAGAAGTATEAKKQEKISTEAAEEVKEIEVDGQL CCGCCGAGCAAAAGGTCCACGCCCAAGAGGAEKLKNEEEKTAKKARKQEIKTEIAEQA AGTGAACAAGGAGTCCCTCAAGAGCAAGGCCAKAQAAKTEAETAQKDATTAKDEAIK CAAAACGCCAAGCCCTGAGGCTGAGAAGGCTGETGKPKSQNTTKAVTMATEEEKKTK CTAAGGCTGCCGAGTCCGCCAAGGAGAACACDEAQTASEKAGKTAEEAQKEVGKET CCTCGACGCCCTGGAGAAGGTGAACGTCCCAADDDKEVSQLEEEIKELERILKIVKDLA ACCGAGCTCAACAACGAGAAGAACTTCGCTGASEASSASDNAKKAKLKTQIAAEVVKA GAGCGCTGCTACCGAGGCCAAGAAGCAGGAGEKARIEAEEAEKEAGEAKTKTEATEK AAGATCTCCACCGAGGCCGCCGAGGAAGTGAEVLKISDESKAAKVKKAVEKAKEAEK AGGAGATCGAGGTGGACGGCCAACTGGAGAAQAKSEAEKAKGMADDAGGKGTTNLE GCTGAAGAACGAGGAAGAGAAGACCGCCAADVLTKLSEVLTSVKSLASNAEVASKN GAAGGCCAGGAAGCAGGAGATCAAGACCGAAKKEMTKAQIAAEVAKAEKAKIEAEN GATCGCTGAGCAAGCTGCTAAGGCTCAGGCTAKLLADTASKAAENIAKSSKAAKIANN GCTAAGACCGAGGCCGAGACGGCCCAAAAGGVSTIAAEKSKVATEAADEAAKALDETE ACGCCACCACCGCCAAGGACGAGGCCATCAANPESKIAEVTEKATKAVNAAEEAKKE GGAGACGGGCAAGCCAAAGAGCCAGAACACCKAKAEVAVEVAHAEVAKEKAQEAKE ACCAAGGCCGTCACCATGGCCACCGAGGAAGAAKQVADKSKLEKAIQAADKASEKAN AGAAGAAGACCAAGGACGAGGCTCAAACCGCEASKLAEEALSNLESLEKETGEIVEKV TTCCGAGAAGGCTGGCAAGACCGCTGAGGAANAIEQKVQTAKNAAIEAHKEKTKAEIA GCCCAGAAGGAAGTGGGCAAGGAGACGGCCVEVAKAEEAKKEADNAKVAAEKAKET GACGACGACAAGGAAGTGTCCCAACTCGAAGAEKIAKTSKSTEKITEEVRKATEFAKT AGGAGATCAAGGAGCTGGAGAGGATCCTCAAAGDETTLAATKAESEIPSEEKNQKELL GATCGTGAAGGACCTGGCTAGCGAGGCCTCCDSIKQKAESAFQASQEAIKAKTEAEN AGCGCTTCCGACAACGCCAAGAAGGCCAAGC 33hypothetical protein, PVX_001000 mNNYGKLKHGKWDDGSYSERTRWRAGTGGGACGACGGCTCCTACAGCGAGAGGAC conserved MLSGDDHDDLLPSCDSPGGRNDEHCAGGTGGAGGATGCTGTCCGGCGACGACCAC QVNKEVSRTAPSEKVKVVDKETGESGACGACCTCCTCCCATCCTGCGACAGCCCAGG MLVDVGESGGKSSPGVAEESGPSLRCGGCAGGAACGACGAGCACCAAGTCAACAAG GRDVRDVRVDQETRETLQGGATNRRGAAGTGTCCAGGACCGCCCCAAGCGAGAAGG 
DLTQHGEEETGDDSKRAKQDDEAGVTGAAGGTGGTCGACAAGGAGACCGGCGAGTC RSMLNDTVTAIKDNGSNLLRSVIGQINCATGCTGGTGGACGTGGGCGAGAGCGGCGGC FVQGSAELLKVANEEERQPSGGSVLAAGTCCTCCCCAGGCGTGGCTGAGGAGTCCG SKEGEEATPGDFLGGNNPNGGEKGEGCCCAAGCCTGCGCGGCAGGGACGTGCGCGA LPNGTKNDVMIKGYANVLLNEGKHVLCGTCAGGGTGGACCAAGAGACCCGCGAGACC VGNVRNFLSRVFNLIVREKIMTRMCHCTGCAGGGCGGCGCCACCAACAGGCGCGACC RGGEASIERSGEPVGERSGEPTGERTCACCCAACACGGCGAGGAAGAGACCGGCGA SGDPTGERSGDPTGERSGEPTGERSCGACAGCAAGCGCGCTAAGCAGGACGACGAG GEPTGERSGEPTAERSGEPTAERSDGCTGGCGTCAGGTCCATGCTCAACGACACCGT EPTAERSDEPTADPKGDPTNCRLPKGACCGCCATCAAGGACAACGGCTCCAACCTCC RSATKFYQSEDLYNYYSSLEEMLGKRTGCGCAGCGTCATCGGCCAAATCAACTTCGTG GIRWKTDRVSRYFTFSPSKKIKDNFECAAGGCAGCGCTGAGCTCCTGAAGGTCGCCA EVMNNKVFIESVRSILFDSHKKNKKAVACGAGGAAGAGCGCCAGCCATCCGGCGGCAG FSSFAVVVETLFSLIKEEKVIADMYSYCGTGCTGTCCAAGGAAGGCGAGGAAGCCACC VKLFFQDLDILNLKVLHFLSSSSTENTCCAGGCGACTTCCTCGGCGGCAACAACCCGAA QFVGPPDLSLTNFEYILAKIYSRSVLACGGCGGCGAGAAGGGCGAGCTGCCAAACGG NILSPKMNHSDSKKLSKLLTRRENNLCACCAAGAACGACGTCATGATCAAGGGCTAC KFSFLEGVKMVHSAIPSEGVSAVVLGGCCAACGTGCTCCTGAACGAGGGCAAGCACG NAGGQVNVPIPGADDTLCKFIPIRKKLTCCTCGTGGGCAACGTCCGCAACTTCCTGTCC LYERLSVTRKVAEEVILDYLFRLLLRKAGGGTGTTCAACCTCATCGTCAGGGAGAAGA VHEYVLEhhhhhhTCATGACCAGGATGTGCCACAGGGGCGGCGA GGCTAGCATCGAGAGGTCCGGCGAGCCAGTGGGGGAGCGCTCCGGCGAGCCAACCGGCGAG 34 merozoite surface PVX_097625mGNVSPPNFNDNRVNGNNGNKGNG CAACAGGGTCAACGGCAACAACGGCAACAAG protein 8 (GPI-NDNDVPSFIGGNNNNVNGNNDDNIF GGCAACGGCAACGACAACGACGTGCCAAGCT anchored, C24)NKNGKDVTRNDGDAKDGENRNNKK TCATCGGCGGCAACAACAACAACGTCAACGNENGSGSNENNSIANADNGSGKSDA AACAACGACGACAACATCTTCAACAAGAACGGNANQIDEDGNKMDEASLKKILKIVDE CAAGGACGTGACCCGCAACGACGGCGACGCTMENIQGLLDGDYSILDKYSVKLVDED AAGGACGGCGAGAACCGCAACAACAAGAAGADGETNKRKIIGEYDLKMLKNILLFREKI ACGAGAACGGCTCCGGCAGCAACGAGAACAASRVCENKYNKNLPVLLKKCSNVDDPK CTCCATCGCCAACGCTGACAACGGCTCCGGCALSKSREKIKKGLAKNNMSIEDFVVGLL AGAGCGACGCCAACGCCAACCAAATCGACGAEDLFEKINEHFIKDDSFDLSDYLADFE GGACGGCAACAAGATGGACGAGGCCAGCCTCLINYIIMHETSELIDELLNIIESMNFRLE AAGAAGATCCTGAAGATCGTGGACGAGATGGSGSLEKMVKSAESGMNLNCKMKEDII 
AGAACATCCAGGGCCTCCTGGACGGCGACTAHLLKKSSAKFFKIEIDRKTKMIYPVQA CTCCATCCTCGACAAGTACAGCGTGAAGCTGGTHKGANMKQLALSFLQKNNVCEHKK TCGACGAGGACGACGGCGAGACGAACAAGACPLNSNCYVINGEEVCRCLPGFSDVK GGAAGATCATCGGCGAGTACGACCTCAAGATIDNVMNCVRDDTLDCSNNNGGCDVN GCTGAAGAACATCCTCCTGTTCAGGGAGAAGATCTLIDKKIVCECKDNFEGDGIYChh ATCTCCCGCGTCTGCGAGAACAAGTACAACAA hhhhGAACCTCCCAGTGCTCCTGAAGAAGTGCAGCA ACGTCGACGACCCAAAGCTCTCCAAGAGCCGCGAGAAGATCAAGAAGGGCCTGGCTAAGAACA ACATGTCCATCGAGGACTTCGTGGTCGGCCTCCTGGAGGACCTGTTCGAGAAGATCAACGAGC ATTCATCAAGGACGACTCCTTCGACCTCAGCGACTACCTGGCCGACTTCGAGCTCATCAACTA CATCATCATGCACGAGACGTCCGAGCTGATCGACGAGCTCCTGAACATCATCGAGAGCATGAAC TTCAGGCTGGAGTCCGGCAGCCTGGAGAAGATGGTGAAGTCCGCCGAGAGCGGCATGAACCT 35 adenylate kinase-like PVX_087110METLLDSETLKNYEKETNEYIRKKKV ATGGAGACGCTCCTGGACTCCGAGACGCTCAAprotein 2, putative EKLFDVILKNVLVNKPENVYLYIYKNIYGAACTACGAGAAGGAGACGAArGAGTACATC (AKLP2) SFLLNKIFVIGPPLLKITPTLCSAIASCFAGGAAGAAGAAGGTGGAGAAGCTCTTCGACG SYYHLSASHMIESYTTGEVDDAAESSTCATCCTCAAGAACGTGCTGGTCAACAAGCCA TSKKLVSDDLICSIVKSNINQLNAKQKGAGAACGTGTACCTGTACATCTACAAGAACAT RGYVVEGFPGTNLQADSCLRHLPSYCTACAGCTTCCTCCTGAACAAGATCTTCGTCAT VFVLYADEEYIYDKYEQENNVKIRSDCGGCCCACCACTCCTGAAGATCACCCCAACCC MNSQTFDENTQLFEVAEFNTNPLKDTCTGCTCCGCCATCGCCTCCTGCTTCAGCTACT EVKVYLRNhhhhhhACCACCTGTCCGCCAGCCACATGATCGAGAGC TACACCACCGGCGAGGTGGACGACGCTGCTGAGTCCAGCACCTCCAAGAAGCTCGTGAGCGAC GACCTGATCTGCTCCATCGTCAAGAGCAACATCAACCAACTCAACGCCAAGCAGAAGAGGGGC TACGTGGTCGAGGGCTTCCCAGGCACCAACCTCCAGGCTGACTCCTGCCTCAGGCACCTGCCAA GCTACGTGTTCGTCCTGTACGCCGACGAGGAGTACATCTACGACAAGTACGAGCAGGAGAACA ACGTGAAGATCAGGTCCGACATGAACAGCCAAACCTTCGACGAGAACACCCAGCTGTTCGAGG TCGCCGAGTTCAACACCAACCCACTCAAGGACGAGGTGAAGGTCTACCTGCGCAACCACCACCA CCACCACCACTGA 36 MSP7-like proteinPVX_082670 mKPGVEKKKKLEEDVIGILRRKLESLQ CTCGAAGAGGACGTCATCGGCATCCTGCGCAKRSLTNSDGKLKKEIELVKKQIQELQK GGAAGCTGGAGTCCCTGCAAAAGAGGTCCCTYEKGEAGKKVDATLGEEPGVESAEE CACCAACAGCGACGGCAAGCTCAAGAAGGAGQPLSVEEAGDTQDEDRLDELEGVED ATCGAGCTGGTCAAGAAGCAAATCCAGGAGCFEEENLEQSEQVEEAEVVEEAEEEA 
TGCAGAAGTACGAGAAGGGCGAGGCTGGCAAGDAEEEQPAEAEEDGSLLEEAPNSV GAAGGTGGACGCTACCCTGGGCGAGGAGCCGERKAEGAIAEFEEADVEEGAEADEGV GGCGTGGAGTCCGCTGAGGAGCAACCACTGAETDEGADADEASLGSFDLEGELIEED GCGTGGAGGAAGCCGGCGACACCCAGGACGALQESFDLEGEQEEEDLQEGFKSEEE GGACAGGCTCGACGAGCTGGAGGGCGTCGAANQGGQLPREIPPHGEEAVEPPLRG GGACTTCGAGGAAGAGAACCTGGAGCAAAGCNKPSMEYVGNLHSDVGPTEGSANQI GAGCAGGTGGAGGAAGCCGAGGTGGTGGAGSPPSVDEKGKEDGDKYKSASQDGGN GAAGCCGAGGAAGAGGCCGGCGACGCTGAGSVGINNFGGCFQGGNSNGICPLDIFK GAAGAGCAACCGGCTGAGGCTGAGGAAGACKVLEDENFLQEFDSFIHNLYGSSKNN GGCTCCCTCCTCGAAGAGGCCCCAAACAGCGTTPWGGDKMGNENLYMDLFTNALSFL GGAGAGGAAGGCTGAGGGCGCTATCGCTGA NTIEVIhhhhhhGTTCGAGGAAGCCGACGTCGAGGAAGGCGCC GAGGCCGACGAGGGCGTGGAGACGGACGAGGGCGCTGACGCTGACGAGGCTTCCCTGGGCA GCTTCGACCTGGAGGGCGAGCTGATCGAGGAAGACCTCCAGGAGTCTTTCGACCTGGAGGGG GAGCAAGAGGAAGAGGACCTCCAAGAGGGCTTCAAGAGCGAGGAAGAGGCCAACCAAGGCG GCCAGCTGCCAAGGGAGATCCCACCACACGGCGAGGAAGCCGTGGAGCCACCACTCCGCGGC AACAAGCCATCCATGGAGTATGTGGGCAACCTGCACAGCGACGTGGGCCCAACCGAGGGCAGC GCCAACCAAATCTCCCCACCAAGCGTCGACGAGAAGGGCAAGGAAGACGGCGACAAGTACAA 37 high molecular weight PVX_099930mELSHSLSVKNAPDASALNIEVEKDK CGCTCCAGACGCTAGCGCTCTCAACATCGAGGrhoptry protein-2, KKICKNAFQYINVAELLSPREEETYVQTCGAGAAGGACAAGAAGAAGATCTGCAAGAA putative KCEEVLDTIKNDSPDESAEAEINEFILCGCCTTCCAATACATCAACGTCGCCGAGCTCCT SLLHARSKYTIINDSDEEVLSKLLRSINGTCCCCAAGGGAGGAAGAGACTTACGTGCAG GSISEEAALKRAKQLITFNRFIKDKAKAAGTGCGAGGAAGTGCTGGACACCATCAAGA VKNVQEMLVISSKADDFMNEPKQKMACGACAGCCCAGACGAGTCCGCTGAGGCTGA LQKIIDSFELYNDYLVILGSNINIAKRYSGATCAACGAGTTCATCCTCAGCCTCCTGCACG SETFLSIKNEKFCSDHIHLCQKFYEQSCCCGCTCCAAGTACACCATCATCAACGACAGC IIYYRLKVIFDNLVTYVDQNSKHFKKEGACGAGGAAGTGCTGAGCAAGCTCCTGAGGT KLLELLNMDYRVNRESKVHENYVLEDCCATCAACGGCAGCATCTCCGAGGAAGCCGCT ETVIPTMRITDIYDQDRLIVEVVQDGNCTCAAGAGGGCTAAGCAACTGATCACCTTCAA SKLMHGRDIEKREISERYIVTVKNLRKCAGGTTCATCAAGGACAAGGCCAAGGTGAAG DLNDEGLYADLMKTVKNYVLSITQIDNAACGTCCAGGAGATGCTCGTCATCTCCAGCAA DISNLVRELDHEDVEKhhhhhhGGCCGACGACTTCATGAACGAGCCAAAGCAA AAGATGCTCCAGAAGATCATCGACAGCTTCGAGCTGTACAACGACTACCTCGTGATCCTGGGCT 
CCAACATCAACATCGCCAAGCGCTACTCCAGCGAGACGTTCCTCAGCATCAAGAACGAGAAGTT CTGCTCCGACCACATCCACCTGTGCCAAAAGTTCTACGAGCAGAGCATCATCTACTACAGGCTC AAGGTCATCTTCGACAACCTGGTGACCTACGTCGACCAAAACTCCAAGCACTTCAAGAAGGAG AAGCTCCTGGAGCTCCTGAACATGGACTACAGGGTGAACCGCGAGTCCAAGGTGCACGAGAAC TACGTCCTGGAGGACGAGACTGTGATCCCAACCATGCGCATCACCGACATCTACGACCAAGACA GGCTCATCGTGGAGGTGGTCCAGGACGGCAACAGCAAGCTGATGCACGGCAGGGACATCGAG 38 IMP-specific 5′- PVX_084340MEKLDIPPHEMYEDMQQAFREQDKY GTACGAGGACATGCAACAGGCCTTCAGGGAG nucleotidaseDFLAISDGSVINSYMKKNVVDWNNRY CAAGACAAGTACGACTTCCTGGCCATCTCCGASYNQLKNKDSLIMFLVDIFRSLFLSNCI CGGCAGCGTGATCAACTCCTACATGAAGAAGADKNIDNVLSSIEEMFTDHYYNPMHSR ACGTGGTCGACTGGAACAACAGGTACTCCTACLKYLIDDVGIFFTKLPITKAFHTYNKKY AACCAGCTCAAGAACAAGGACAGCCTCATCATRITKRLYAPPTFNEVRHILNLAQILSLE GTTCCTGGTGGACATCTTCCGCTCCCTCTTCCTDGLDLLTFDADETLYPDGYDFHDEVL GAGCAACTGCATCGACAAGAACATCGACAACASYISSLLKKMNIAIVTAASYSNDAEK GTCCTGTCCAGCATCGAGGAGATGTTCACCGAYQKRLENLLRYFSKHNIEDGSYENFY CCACTACTACAACCCAATGCACAGCAGGCTCAVMGGESNYLFKCNEDANLYSVPEEE AGTACCTGATCGACGACGTGGGCATCTTCTTCWYHYKKYVNKETVEQILDISQKCLQQ ACCAAGCTCCCAATCACCAAGGCCTTCCACACVITDFKLCAQIQRKEKSIGLVPNKIPSA CTACAACAAGAAGTACAGGATCACCAAGCCCNNQKEQKNYMIKYEVLEEAVIRVKKEI TGTACGCCCCACCAACCTTCAACGAGGTCCGCVKNKITAPYCAFNGGQDLWVDIGNKA CACATCCTCAACCTGGCCCAAATCCTCTCCCTGEGLIILQKLLKIEKKKCCHIGDQFLHSG GAGGACGGCCTCGACCTCCTGACCTTCGACGCNDFPTRFCSLTLWISNPQETKACLKSI CGACGAGACGCTGTACCCAGACGGCTACGACMNLNMKSFIPEVLYENEhhhhhh TTCCACGACGAGGTGCTCGCCAGCTACATCTCCAGCCTCCTGAAGAAGATGAACATCGCCATCG TCACCGCCGCCTCCTACAGCAACGACGCCGAGAAGTACCAGAAGAGGCTGGAGAACCTCCTGC GCTACTTCTCCAAGCACAACATCGAGGACGGCAGCTACGAGAACTTCTACGTGATGGGCGGCG AGTCCAACTACCTCTTCAAGTGCAACGAGGACGCCAACCTGTACAGCGTCCCAGAGGAAGAGT GGTACCACTACAAGAAGTATGTGAACAAGGAGACGGTCGAGCAAATCCTCGACATCTCCCAGA AGTGCCTGCAACAAGTGATCACCGACTTCAAGCTCTGCGCCCAAATCCAGAGGAAGGAGAAGT 39 subpellicular PVX_098915MEIIAEKPKVKFNFASEEYKNCDSSD AGTTCAACTCGCCTCCGAGGAGTACAAGAACmicrotubule protein 1, YSECAEDYGRPNGKDYFYANRILSLDTGCGACTCCAGCGACTACTCCGAGTGCGCTGA putative (SPM1)RNSEQRRKESPSKRPGLCVDEICTC 
GGACTACGGCAGGCCAAACGGCAAGGACTACGFHRCPKIVKSLPFDGESNYRSEFGP TTCTACGCCAACAGGATCCTCTCCCTGGACCGKPLPELPPRQEAKLTRSLPFEGESNY CAACAGCGAGCAGAGGCGCAAGGAGTCCCCARSEFGPKPLPELPPRVEQKPPKSLPF AGCAAGAGGCCAGGCCTCTGCGTGGACGAGADGESNYRSEFGPKPLPELPPRVEQK TCTGCACCTGCGGCTTCCACCGCTGCCCAAAGPPKSLPFDGESNYRSEFGPKPLPELP ATCGTCAAGTCCCTGCCATTCGACGGCGAGTCPRVEQKPPKSLPFEGESNYRSEFGP CAACTACCGCAGCGAGTTCGGCCCAAAGCCACKPLPELPPRVEQKPPKSLPFEGESNY TCCCAGAGCTGCCACCAAGGCAAGAGGCCAARSEFGPKALPELPPRVEQKPPKSLPF GCTCACCCGCAGCCTGCCATTCGAGGGCGAGTEGESNYRSEFGPKPLPALPPRVETKL CCAACTACAGGTCCGAGTTCGGGCCTAAGCCTVKSLPFEGESNYRSEFGPKPLPELPP CTGCCTGAGCTGCCACCACGCGTGGAGCAAARVEQKPPKSLPFEGESNYRSEFGPK AGCCACCAAAGTCCCTCCCTTTCGATGGGGAGPLPALPPRVVTKLVKSLPFEGESNYR AGCAACTACAGGAGTGAATTCGGGCCTAAGCSEFGPKPLPEIPPRVEQKPPKSLPFE CGCTGCCCGAGCTGCCACCACGCGTCGAGCAGESNYRSEFGPKPLPELPPRVEQKP GAAGCCACCAAGAGCCTCCCTTTCGATGGCGPKSLPFEGESNYRSEFGPKQLPELPP AGAGCAACTACAGGAGCGAATTTGGGCCTAARQEAKLTRSLPFEGESSYRSEYVRKA GCCGCTGCCGGAACTGCCACCACGCGTGGAAIPICPVNLLPKYPAPTYPSEHVFWDSA CAAAAGCCACCAAAGAGCCTGCCTTTCGAGGG CKRWYhhhhhhGGAGTCCAACTACAGGAGTGAGTTTGGGCCT AAGCCGTTGCCTGAACTGCCACCACGCGTCGAACAGAAACCACCAAAAAGCCTCCCTTTCGAGG GCGAGAGCAACTACCGCTCCGAGTTCGGCCCAAAGGCTCTGCCGGAGCTGCCACCACGCGTGG AACAGAAACCACCAAAGAGCCTCCCCTTCGAGGGGGAGAGCAATTATCGCTCTGAGTTCGGGC CAAAGCCGCTGCCGGCTCTGCCACCACGCGTG 40tryptophan-rich antigen PVX_088820 mAAANRPNANGFVSPTLIGFGELSIQATGGCTGCCGCCAACAGGCCAAACGCCAACG (Pv-fam-a) ESEEFKRMAWNNWMLRLESDWKHFGCTTCGTCTCCCCAACCCTCATCGGCTTCGGCG NDSVEEAKTKWLHERDSAWSDWLRAGCTGTCCATCCAAGAGAGCGAGGAGTTCAA SLQSKWSHYSEKMLKEHKSNVMEKSGAGGATGGCCTGGAACAACTGGATGCTCCGC ANWNDTQWGNWIKTEGRKILEAQWCTGGAGTCCGACTGGAAGCACTTCAACGACA EKWIKKGDDQLQKLIWKWVQWKNDGCGTGGAGGAAGCCAAGACCAAGTGGCTGCA KIRSWLSSEWKTEEDYYWANVERATCGAGAGGGACTCCGCTTGGAGCGACTGGCTC TAKWLQEAEKMHWLKWKERINRESECGCTCCCTGCAGAGCAAGTGGTCCCACTACAG QWVNWVQMKESVYINVEWKKWPKCGAGAAGATGCTGAAGGAGCACAAGTCCAAC WKNDKKILFNKWSTNLVYKWTLKKQGTCATGGAGAAGAGCGCCAACTGGAACGACA WNVWIKEANTAPQVhhhhhhCCCAATGGGGCAACTGGATCAAGACCGAGGG 
CCGCAAGATCCTGGAGGCCCAGTGGGAGAAGTGGATCAAGAAGGGCGACGACCAACTGCAGA AGCTCATCCTGGACAAGTGGGTCCAGTGGAAGAACGACAAGATCAGGTCCTGGCTCTCCAGCG AGTGGAAGACCGAGGAAGACTACTACTGGGCTAACGTGGAGAGGGCTACCACCGCTAAGTGG CTCCAAGAGGCCGAGAAGATGCACTGGCTGAAGTGGAAGGAGAGGATCAACCGCGAGTCCGA GCAATGGGTGAACTGGGTCCAGATGAAGGAGAGCGTGTACATCAACGTCGAGTGGAAGAAGT GGCCAAAGTGGAAGAACGATAAGAAGATCCTGTTCAACAAGTGGAGCACCAACCTCGTGTACA AGTGGACCCTGAAGAAGCAGTGGAACGTCTGGATCAAGGAAGCCAACACCGCCCCACAGGTG CACCACCACCACCACCACTGA 41 PvTRAP/SSP2PVX_082735 mEKVVDEVKYSEEVCNESVDLYLLVD GCGAGGAAGTGTGCAACGAGTCCGTCGACCTGSGSIGYPNWITKVIPMLNGLINSLSL CTACCTCCTGGTGGACGGCTCCGGCAGCATCGSRDTINLYMNLFGNYTTELIRLGSGQS GCACCCAAACTGGATCACCAAGGTCATCCCAIDKRQALSKVTELRKTYTPYGTTNMT ATGCTCAACGGCCTGATCAACTCCCTCAGCCTAALDEVQKHLNDRVNREKAIQLVILM GTCCCGCGACACCATCAACCTCTACATGAACCTDGVPNSKYRALEVANKLKQRNVSL TGTTCGGCAACTACACCACCGAGCTCATCAGGAVIGVGQGINHQFNRLIAGCRPREPN CTGGGCAGCGGCCAATCCATCGACAAGCGCCCKFYSYADWNEAVALIKPFIAKVCTEV AGGCCCTCAGCAAGGTGACCGAGCTGAGGAAERVANCGPWDPWTACSVTCGRGTH GACCTACACCCCATACGGCACCACCAACATGASRSRPSLHEKCTTHMVSECEEGECP CCGCCGCCCTCGACGAGGTGCAAAAGCACCTVEPEPLPVPAPLPTVPEDVNPRDTDD GAACGACAGGGTCAACCGCGAGAAGGCCATCENENPNFNKGLDVPDEDDDEVPPAN CAGCTCGTGATCCTGATGACCGACGGCGTCCCEGADGNPVEENVFPPADDSVPDESN AAACAGCAAGTACCGCGCCCTGGAGGTGGCCVLPLPPAVPGGSSEEFPADVQNNPD AACAAGCTGAAGCAAAGGAACGTCTCCCTGGSPEELPMEQEVPQDNNVNEPERSDS CCGTGATCGGCGTGGGCCAAGGCATCAACCANGYGVNEKVIPNPLDNERDMANKNK CCAGTTCAACAGGCTGATCGCTGGCTGCAGGCTVHPGRKDSARDRYARPHGSTHVNN CACGCGAGCCAAACTGCAAGTTCTACAGCTACNRANENSDIPNNPVPSDYEQPEDKA GCTGACTGGAACGAGGCTGTGGCTCTCATCAAKKSSNNGYKhhhhhh GCCATTCATCGCCAAGGTCTGCACCGAGGTGGAGAGGGTGGCTAACTGCGGCCCATGGGACCC GTGGACCGCTTGCTCCGTGACCTGCGGCAGGGGCACCCACAGCAGGTCCCGCCCAAGCCTGCA CGAGAAGTGCACCACCCACATGGTGTCCGAGTGCGAGGAAGGCGAGTGCCCAGTGGAGCCAG AGCCACTGCCGGTCCCAGCCCCACTGCCAACCGTGCCAGAGGACGTCAACCCAAGGGACACCG ACGACGAGAACGAGAACCCAAACTTCAACAAGGGCCTCGACGTGCCAGACGAGGACGACGAC 42 MSP7-like protein PVX_082645mDDKKDKENEHKEDADKKNNDELKT CACAAGGAAGACGCCGATAAGAAGAACAACGLKGKLQKIRVQIKDDKLPQKISEEQIS 
ACGAGCTCAAGACCCTGAAGGGCAAGCTCCAVLKKKLEDFKNLKSEHEAKLASEKGD AAAGATCAGGGTGCAGATCAAGGACGACAAGTSAGGEGELGLSDKEFVGQNVKANG CTGCCACAAAAGATCTCCGAGGAGCAGATCADAAGVSGEQGASGGSGQGEAGPSS GCGTCCTCAAGAAGAAGCTGGAGGACTTCAAPADEQDDDNEAVQWGPATEEVVAE GAACCTCAAGTCCGAGCACGAGGCCAAGCTGAMSDEGPQEQGAEGGPSNPTDDQA GCCTCCGAGAAGGGCGACACCTCCGCCGGCGEEATPGPSKPASGASGSQGASDSSN GCGAGGGCGAGCTGGGCCTGTCCGACAAGGADSAEPTSAAAAAAPAGPTAAAASPO GTTCGTGGGCCAAAACGTCAAGGCCAACGGCVKHVDTLCDELLAGENKKNVLDEGE GACGCCGCCGGCGTGAGCGGCGAGCAAGGCDHSQYNIFRKQYDKMVLNKTEYNISL GCCTCCGGCGGCAGCGGCCAGGGCGAGGCTGKLLDTMLTNGQVEREKKNTLIKTFKK GCCCATCCAGCCCAGCCGACGAGCAAGACGAALYDKQYSEKLRNLISGVYAFAKRNN CGACAACGAGGCTGTCCAGTGGGGCCCAGCTFIDGDKWEGDYSKLFEYIGCMMNTL ACCGAGGAAGTGGTGGCTGAGGCTATGTCCG ELhhhhhhACGAGGGCCCACAAGAGCAGGGCGCTGAGG GCGGCCCAAGCAACCCAACCGACGACCAAGCTGAGGAAGCCACCCCAGGCCCATCCAAGCCA GCTTCCGGCGCTTCCGGCAGCCAGGGCGCTTCCGACTCCAGCAACGACTCCGCCGAGCCAACCA GCGCTGCCGCCGCCGCCGCCCCAGCTGGCCCAACCGCTGCCGCCGCCAGCCCACAGGTGAAGC ACGTGGACACCCTCTGCGACGAGCTCCTGGCTGGCGAGAACAAGAAGAACGTGCTGGACGAG GGCGAGGACCACTCCCAATACAACATCTTCAGGAAGCAGTACGACAAGATGGTCCTCAACAAG ACCGAGTACAACATCAGCCTCAAGCTCCTGGACACCATGCTGACCAACGGCCAAGTGGAGCGC GAGAAGAAGAACACCCTCATCAAGACCTTCAA 43early transcribed PVX_111065 mKRHA
RGALHSLKS
EHEVQRKKNK ACTCCCTGAAGAGCATCGAGCACGAGGTGCA membrane protefnKKKIILYSIGSILALAAVIATGVGIGMYI AAGGAAGAAGAACAAGAAGAAGAAGATCATC(etramp 10.2) KKKKKNSLEKLQQIEPQKLESKTDESCTCTACTCCATCGGCAGCATCCTGGCTCTGGCT DPLLGKSEAAKVEVKGDSEEVPQEVGCCGTGATCGCTACCGGCGTCGGCATCGGCAT SSPSEALDVEPPVSEALNMEPAVGEGTACATCAAGAAGAAGAAGAAGAACAGCCTG SANFEDSAKGEVDIEPVSEVESIEPVSGAGAAGCTGCAACAGATCGAGCCACAAAAGC EVESIEPVSEVESIEPSVDEVMDAAETGGAGTCCAAGACCGACGAGAGCGACCCACT PISTEPVNVEPAGNETENIVPTSFEQVCCTGGGCAAGAGCGAGGCTGCTAAGGTGGAG NIEPAVSEAFSQERSGEETADFEDSVGTCAAGGGCGACTCCGAGGAAGTGCCACAAG KEDVIPESPPVESVTIEAENIQPMNVEAGGTGTCCATCCCGAGCGAGGCTCTGGACGT QMNVDPTVSDAESIEPTPVEAVDIEPGGAGCCACCAGTCTCCGAGGCCCTGAACATG VNVEPVNVEPAVSETMSQEPSLDEVGAGCCAGCCGTGGGCGAGTCCGCCAACTTCG ENVESAVNEMMSQEPSAEETANFAHAGGACAGCGCCAAGGGCGAGGTCGACATCGA SIKEDVSPESTSVESLDVESSVSEPMGCCAGTGTCCGAGGTCGAGTCTATTGAACCAG STDPSPVESVSMESVDSETVNVESIDTGTCCGAGGTGGAGTCTATTGAGCCAGTGTCC SETVNVEPSDETSKVEADVQQFTDEGAAGTCGAGAGCATCGAGCCATCCGTGGACG ELSTIGNVADKASDGPAPEASDFPDSAGGTCATGGACGCTGCTGAGCCAATCAGCACC IFEENLDNANPPLKLEDALVDPPASDGAGCCAGTGAACGTCGAGCCAGCCGGCAACG EAQPEPSHPNEAVGAAKSAESAEADAGACGGAGAACATCGIGCCAACCTCCTTCGAG QISHSGSGDASPSAPSSSDDTSGSKCAAGTGAACATCGAGCCAGCCGTCAGCGAGG NSGTSGKDRLFKTYDSDVEPPIVPEKCCTTCTCCCAAGAGAGGAGCGGCGAGGAGAC YPTVGVKEAPKMGFAEMAFKNIFDTFGGCTGACTTCGAGGACTCCGTGAAGGAAGAC SKVADASKVLTPEKQSAPEKQSAPEKGTCATCCCAGAGTCCCCACCAGTGGAGAGCGT QSAPEKQSAPEKHSTPPKQSTSPKECACCATCGAGGCCGAGAACATCCAACCGATGA STSPKQPAPPKPSTSPKQSAPAKQSACGTGGAGCAGATGAACGTGGACCCAACCGT APPKQSAPAKQSAPAKNAAPPQSASCTCCGACGCCGAGAGCATCGAGCCAACCCCAG SSRFFSSSSNGNKGFGLRLFSDASSSTGGAGGCCGTGGATATCGAGCCTGTCAACGT NNKKGRAGNPIIRFKRRANhhhhhhGGAGCCTGTCAACGTTGAGCCAGCCGTGTCCG 44 hypothetical protein, PVX_091500MNNPAEVVAAHLRRTGNSNEIRQAS ACCTGAGGCGCACCGGCAACTCCAACGAGATC conservedHVESVGGSANSSLDDDDGGGYDSAA AGGCAGGCTAGCCACGTGGAGAGCGTCGGCGPPGELHTTGDAPPGEFRTTGVVPPG GCTCCGCTAACTCCAGCCTCGACGACGACGACRQKGGKKRMFKIKKKKSLTPLHIDDG GGCGGCGGATACGACAGCGCMCCCCACCAGGFTQGGEAKGPDVALESFAITRKRRR 
GCGAGCTCCACACCACCGGCGACGCCCCACCAPPLLGRGVVESSNIELTSKLGGKLGS GGCGAGTTCCGCACCACCGGCGTGGTCCCACCKLGGKLNPTLSLVASRAVDGLLGGVH AGGCAGGCAAAAGGGCGGCAAGAAGCGCATKHMQGPFSLDLDGTNNSPLATPIVTP GTTCAAGATCAAGAAGAAGAAGTCCCTCACCCNLYSNISTPFNMHNGIPPSAPAPMAL CACTGCACATCGACGACGGCGGCTTCACCCAGPPQGVQVPLPNAQPQPPPSVATTAT GGCGGCGAGGCTAAGGGCCCAGACGTGGCTCAAPAATSPMASPTTPTPAASTGVPPP TGGAGTCCTTCGCCATCACCAGGAAGAGGCGPGIQLATNAMTYPQMNMQNVMTANQ CAGGCCACCACTCCTGGGCCGCGGCGTGGTCMAQNPAFNIHPTATNLRDDPGNVNY GAGTCCAGCAACATCGAGCTCACCAGCAAGCTNEVVTITIGIVICLFLFCFVFGCIVKMC GGGCGGCAAGCTCGGCTCCAAGCTGGGCGGCKPAKRRRhhhhhh AAGCTCAACCCGACCCTCAGCCTGGTGGCCTCCAGGGCCGTGGACGGCCTCCTGGGCGGCGTG CACAAGCACATGCAAGGCCCATTCAGCCTCGACCTGGACGGCACCAACAACTCCCCACTGGCCA CCCCAATCGTCACCCCAAACCTCTACTCCAACATCAGCACCCCATTCAACATGCACAACGGCATC CCACCAAGCGCTCCAGCTCCAATGGCTCTGCCACCACAAGGCGTGCAGGTCCCACTCCCAAACG CCCAACCACAACCACCACCATCCGTGGCTACCACCGCTACCGCTGCTCCAGCTGCTACCAGCCC AATGGCTTCCCCAACCACCCCAACCCCAGCTGCTAGCACCGGCGTGCCACCACCACCAGGCATC CAGCTGGCCACCAACGCCATGACCTACCCACAGATGAACATGCAGAACGTCATGACCGCCAACC 45 hypothetical protein, PVX_090145mSKTGNNNRNAKNAKGGGGGGKRG CCAAGAACGCTAAGGGCGGCGGCGGCGGCG conservedNNEANKNDGMSGKGSQKGKKKDPG GCAAGAGGGGCAACAACGAGGCCAACAAGAGGGTPKGQGKGPEQGKQKNKKGED ACGACGGCATGTCCGGCAAGGGCAGCCAAAASHFDEYIKDMKNSQDEDNFMDELNR GGGCAAGAAGAAGGACCCAGGCGGCGGCGGFEKNFHDEDFESDENLFNYGKGGTH CACCCCGAAGGGCCAGGGCAAGGGCCCAGAGSGEFNKIGELNSGNYNEMKPDANDY CAAGGCAAGCAGAAGAACAAGAAGGGCGAGQYFDNEDILEGDEDLTNIWNKNMQNF GACTCCCACTCGACGAGTACATCAAGGACATEPSTLLTFEIQGNSEEYLFEEVTSLNT GAAGAACAGCCAAGACGAGGACAACTTCATGYFRGVFYSNNESDDNKILFFITDPDGE GACGAGCTCAACAGGTTCGAGAAGAACTTCCAVIYKKEASEGIFYFYTQKIGVYTITLKN  CGACGAGGACTTCGAGTCCGACGAGAACCTGSKWMGKKLTTVALGLGESPSLKSEHI TTCAACTACGGCAAGGGCGGCACCCACTCCGGKDFTNYIDKIVAETKRLKNELKYLSSK CGAGTTCAACAAGATCGGCGAGCTCAACAGCHMTHIEKMKKITNKAFLYCFIKLFVLVF GGCAACTACAACGAGATGAAGCCAGACGCCALSLFTIYYIKNLVSNKRVLhhhhhh ACGACTACCAGTACTTCGACAACGAGGACATCCTGGAGGGCGACGAGGACCTGACCAACATCT GGAACAAGAACATGCAAAACITCGAGCCAAGCACCCTCCTGACCTTCGAGATCCAGGGCAACT 
CCGAGGAGTACCTCTTCGAGGAAGTGACCAGCCTGAACACCTACTTCCGCGGCGTCTTCTACTC CAACAACGAGAGCGACGACAACAAGATCCTGTTCTTCATCACCGACCCAGACGGCGAGGTCAT CTACAAGAAGGAAGCCTCCGAGGGCATCTTCTACTTCTACACCCAAAAGATCGGCGTGTACACC ATCACCCTCAAGAACAGCAAGTGGATGGGCAAGAAGCTGACCACCGTGGCTCTGGGCCTGGG CGAGTCCCCAAGCCTCAAGAGCGAGCACATCAAGGACTTCACCAACTACATCGACAAGATCGTC GCCGAGACGAAGAGGCTGAAGAACGAGCTCA 46hypothetical protein, PVX_119265 MNNHQAVKQQMNPKGSKEQNRMVAGAACCCAAAGGGCTCCAAGGAGCAGAACAGG conserved PNSNMPGGMRDLAYHRNNGNNEMGATGGTGGCCCCAAACAGCAACATGCCAGGCG KMNMNANGQQHNAGSSNTYNSNSINGCATGAGGGACCTCGCTTACCACAGGAACAAC NNNYSLGLYIDNPQNAFVFDENDLKTGGCAACAACGAGATGGGCAAGATGAACATGA LFSHYKGAKNIRILNDKAAAQITFNDKACGCCAACGGCCAACAGCACAACGCCGGCTCC NMIQQVRKDINGLTITDIGTIRCIILNEGAGCAACACCTACAACTCCAACTCCATCAACAA KIVEQFLPFSANDPASAQQKGGSNQCAACAACTACTCCCTCGGCCTGTACATCGACA SGDSTVDMLKKLANLLQPERAMDSSACCCACAAAACGCCTTCGTCTTCGACGAGAAC MAPKMGDNGGLSATGSVNMGASIATGACCTCAAGACCCTGTTCAGCCACTACAAGGG NVGMGGNMPTNANMGGVITTNANVSCGCCAAGAACATCAGGATCCTCAACGACAAG ANVSANVSANPMPGKNQVKNKMGNGCTGCCGCCCAGATCACCTTCAACGACAAGAA HAIYNNGGSHFNQAHMNKGEPGENNCATGATCCAACAGGTCAGGAAGGACATCAAC PYATKRLSRIELIDIFGFPVEFDVMKKIGGCCTGACCATCACCGACATCGGCACCATCCG LGKNNSNISYIKEQTNNSVSIEIKGKPCTGCATCATCCTCAACGAGGGCAAGATCGTGG FNEAPIVERMHVSVSSDDLIGYKKATAGCAATTCCTGCCATTCTCCGCCAACGACCCG ELIVKLLNSIFEEFYDFCYEKNYPVPEGCTAGCGCTCAACAGAAGGGCGGCTCCAACC NLSFKRHEYMYNPDGSTKYVGFKDKAAAGCGGCGACTCCACCGTGGACATGCTCAA WHVMKDSYRTDYSFRKNKGLQKNDGAAGCTCGCTAACCTCCTGCAGCCAGAGAGG KDKRMHGGAFGGHPNLSIGYANQNAGCCATGGACTCCAGCATGGCCCCAAAGATGG PQGDFKEMNhhhhhhGCGACAACGGCGGCCTCTCCGCTACCGGCTCC GTCAACATGGGCGCCTCCATCGCCACCAACGTGGGCATGGGCGGCAACATGCCAACCAACGCC AACATGGGCGGCGTCATCACCACCAACGCCAACGTGAGCGCCAACGTCTCCGCTAACGTGAGCG CTAACCCAATGCCAGGCAAGAACCAAGTGAAGAACAAGATGGGCAACCACGCCATCTACAACA ACGGCGGCTCCCACTTCAACCAGGCCCACATGAACAAGGGCGAGCCAGGCGAGAACAACCCAT 47 rhoptry neck protein 2, PVX_117880mREAKGSVRDGKQYVKTKSPTYTPQ ATGCGCGAGGCTAAGGGCTCCGTGCGCGACGputative (RON2) KKTKVIFYMPGQEQEEEEDDNDPNGGCAAGCAATACGTCAAGACCAAGAGCCCAAC 
SKKNGKSDTGANKGTHMGSKTDAGCTACACCCCACAGAAGAAGACCAAGGTCATCT NSPSGLNKGSGVGSGSRPASNNYKGTCTACATGCCAGGCCAAGAGCAAGAGGAAGA NAGGGINIDMSPHGDNSNKGQQGNAGGAAGACGACAACGACCCAAACGGCTCCAAG GLNKNQEDTLRDEYEKIRKQEEEEEEAAGAACGGCAAGAGCGACACCGGCGCCAACA RINNQRRADMKRAQRGKNKFGDDKAGGGCACCCACATGGGCTCCAAGACCGACGC GVQDShhhhhhTGGCAACTCCCCGAGCGGCCTCAACAAGGGCT CCGGCGTGGGCTCCGGCAGCAGGCCAGCCAGCAACAACTACAAGGGCAACGCCGGCGGCGGC ATCAACATCGACATGTCCCCACACGGCGACAACAGCAACAAGGGCCAACAGGGCAACGCCGGC CTCAACAAGAACCAAGAGGACACCCTGAGGGACGAGTACGAGAAGATCCGCAAACAAGAGGA AGAGGAAGAGGAGCGCATCAACAACCAAAGGCGCGCTGACATGAAGAGGGCTCAGAGGGGCA AGAACAAGTTCGGCGACGACAAGGGCGTGCAAGACAGCCACCACCACCACCACCACTGA 48 tryptophan-rich antigen PVX_121897mSSQSAVDYIEQEPLDILNLEEGDLE ATGTCCAGCCAAAGCGCCGTGGACTACATCGA (Pv-fam-a)VTEQWKDNEWHNWKLKLEEDWDSF GCAGGAGCCACTCGACATCCTCAACCTCGAAGSTSLIRDKKDFMKIKTDELNGWLNLE AGGGCGACCTGGAGGTCACCGAGCAGTGGAAENKWNNFSGYLSDGYKNYLLKKSEK GGACAACGAGTGGCACAACTGGAAGCTCAAGWNDADWENWANTEMVAHLDKDYHL CTCGAAGAGGACTGGGACTCCTTCAGCACCTCWSLNTERSVNALVRGEWNQWQHDK CCTCATCAGGGACAAGAAGGACTTCATGAAGMSSWLSSDWKKVGAMYWDLQESR ATCAAGACCGACGAGCTGAACGGCTGGCTCANWASYSHTDDMKEHWIKWNDRNAR ACCTGGAGGAGAACAAGTGGAACAACTTCAGENIEWSKWVQNKEYFIMYARHSDIEQ CGGCTACCTCTCCGACGGCTACAAGAACTACCWKYDNYALYSTWRNDFINRWVSEKK TCCTGAAGAAGTCCGAGAAGTGGAAGGACGC WNSILNhhhhhhCGACTGGGAGAACTGGGCCAACACCGAGATG GTGGCCCACCTCGACAAGGACTACCACCTCTGGAGCCTGAACACCGAGAGGTCCGTGAACGCT CTGGTCCGCGGCGAGTGGAACCAATGGCAGCACGACAAGATGTCCAGCTGGCTCTCCAGCGAC TGGAAGAAGGTCGGCGCCATGTACTGGGACCTGCAGGAGAGCAGGAACTGGGCCAGCTACTC CCACACCGACGACATGAAGGAGCACTGGATCAAGTGGAACGACAGGAACGCCCGCGAGAACA TCGAGTGGTCCAAGTGGGTGCAAAACAAGGAGTACTTCATCATGTACGCCCGCCACAGCGACA TCGAGCAGTGGAAGTACGACAACTACGCCCTCTACTCCACCTGGAGGAACGACTTCATCAACCG CTGGGTCAGCGAGAAGAAGTGGAACTCCATCCTGAACCACCACCACCACCACCACTGA 49 tryptophan-rich antigen PVX_125728mKSSNEIERLTHVKLKDTSEWTENVE ATGAAGTCCAGCAACGAGATCGAGAGGCTCA (Pv-fam-a)EWVKDEWHEWMDEVQMDWKEFNS CCCACGTGAAGCTGAAGGACACCTCCGAGTGSLESEKNKWFGKKEKEMMELIKSIED GACCGAGAACGTGGAGGAGTGGGTCAAGGAKWLDFNENMHEVLNYAILKISLMWSF 
CGAGTGGCACGAGTGGATGGACGAGGTCCAGSEWQKWINKDGKRIIENQWERWTIS ATGGACTGGAAGGAGTTCAACTCCAGCCTGGNKNLYYKIIMKEWFKWKNKKIKQWLK AGTCCGAGAAGAACAAGTGGTTCGGCAAGAARNWLHHEGRILENWERLPYTKILAMS GGAGAAGGAGATGATGGAGCTGATCAAGAGCEKKPWFNSNAQVINERDYFLIWIKKK ATCGAGGACAAGTGGCTCGACTTCAACGAGAEDFLVNEERDKWENWEYYKNDFFQT ACATGCACGAGGTGCTCAACTACGCCATCCTCWMDSFLSHWLNIKKRDILHSQShhhh AAGATCTCCCTGATGTGGTCCTTCAGCGAGTG hhGCAAAAGTGGATCAACAAGGACGGCAAGAGG ATCATCGAGAACCAGTGGGAGCGCTGGACCATCAGCAACAAGAACCTGTACTACAAGATCATC ATGAAGGAGTGGTTCAAGTGGAAGAACAAGAAGATCAAGCAATGGCTCAAGAGGAACTGGCT GCACCACGAGGGCAGGATCCTGGAGAACTGGGAGCGCCTGCCATACACCAAGATCCTCGCCAT GTCCGAGAAGAAGCCATGGTTCAACAGCAACGCCCAAGTGATCAACGAGAGGGACTACTTCCT GATCTGGATCAAGAAGAAGGAAGACTTCCTCGTCAACGAGGAGCGCGACAAGTGGGAGAACT GGGAGTACTACAAGAACGACTTCTTCCAAACCTGGATGGACTCCTTCCTCAGCCACTGGCTGAA CATCAAGAAGCGCGACATCCTCCACTCCCAGAGCCACCACCACCACCACCACTGA 50 reticulocyte binding PVX_090330mRLKHDHNLLPNYANLMRDDQNGQ ATGAGGCTCAAGCACGACCACAACCTCCTGCCprotein 2 precursor NSENRGDNINNHNKNHNDQNNHNGAAACTACGCCAACCTGATGAGGGACGACCAA (PvRPB-2), putativeNNDNSINSEYLKTSHLQNSSAMVHLN AACGGCCAGAACTCCGAGAACCGCGGCGACADHKITTKPARYSYIQRSKIYAFNPNNK ACATCAACAACCACAACAAGAACCACAACGACKIENNNELHShhhhhh CAAAACAACCACAACGGCAACAACGACAACTCCATCAACAGCGAGTACCTCAAGACCAGCCACC TGCAGAACTCCAGCGCCATGGTGCACCTCAACGACCACAAGATCACCACCAAGCCAGCCAGGTA CTCCTACATCCAACGCAGCAAGATCTACGCCTTCAACCCAAACAACAAGAAGATCGAGAACATCA ACAACGAGCTGCACTCCCACCACCACCACCAC CACTGA51 histone-lysine N- PVX_123685 mSMEQGTPIVFPHKEGTILTKGTNNLCCCACACAAGGAAGGCACCATCCTCACCAAGG methyltransferase, H3AVAHKEEVHRSEEETTLKGLKEELPH GCACCAACAACCTGGCCGTGGCCCACAAGGAlysine-4 specific, EHTLAIQKYDPSFGRGGSPGSGSTEAGAGGTGCACAGGAGCGAGGAAGAGACGAC putative (SET10)HTNGSFSNSYETILYNKSNDVVKNLK CCTCAAGGGCCTGAAGGAAGAGCTCCCACACEIKKGAPFGGVISDAVSCPASSSSNT GAGCACACCCTGGCCATCCAGAAGTACGACCCGGNKNLCFSNMMKLSKKILGFPLLTD AAGCTTCGGCCGCGGCGGCTCCCCAGGCAGCFERGMSTNQPCLPLSDHLKRLSVCT GGCAGCACCGAGCACACCAACGGCTCCTTCAGVCYSKHNDLAKAIICRVTKMHFEANY CAACTCCTACGAGACGATCCTCTACAACAAGTNDGLGDEDMFKTSSECIQSVIRELAN 
CCAACGACGTGGTCAAGAACCTGAAGGAGATTIKEYRKRELSGAYVQELARSGSSSY CAAGAAGGGCGCTCCATTCGGCGGCGTGATCRSCSSSSYSSRGGSCAGSRGDGLA TCCGACGCCGTCTCCTGCCCGGCCGCCAGCTCCGSHGEIHAVIAGPPLTDDHNDIGAEA AGCAACACCGGCGGCAACAAGAACCTCTGCTTHSPSSSLKLPPQKPFYGMMSDPPCS CAGCAACATGATGAAGCTCTCCAAGAAGATCCDRRPGDTNNPFENNTPPLLWDNKVN TGGGCTTCCCACTCCTGACCGACTTCGAGAGGYTDDYTCKRGEVNSTLGKRPHEEDN GGCATGAGCACCAACCAACCATGCCTCCCACTKGSSQKKSKLRTKPSNDTIGGENGD GAGCGACCACCTCAAGCGCCTGTCCGTGTGCASLKGGTDEGKTHEGGGNVGSCTAQ CCGTCTGCTACAGCAAGCACAACGACCTGGCCGGADQLPRSDLCRDPRGDPCVDPLP AAGGCCATCATCTGCAGGGTGACCAAGATGCEQHAHRSKDENQKGDKNDIHFAGEK ACTTCGAGGCCAACTACAACGACGGCCTCGGCLDEIEAPGDQKGNYVTLENISKASNFI GACGAGGACATGTTCAAGACCTCCAGCGAGTPLLGVELGSTKIQREFTNGTYVGTVT GCATCCAATCCGTGATCCGCGAGCTGGCCAACEQIKDEHGNPFFVVTYEDGDAEWMT ACCATCAAGGAGTACAGGAAGCGCGAGCTGTPCFLFQELLKQSTNSVDYPLATTFKE CCGGCGCCTACGTCCAAGAGCTCGCTAGGTCCVFNPEFKKDLKLSNCSLELKIERRKRK GGCTCCAGCTCCTACAGGAGCTGCAGCTCCAGSNCESASNNNSVSKRQKHAQEENSS CTCCTACAGCTCCAGGGGCGGCAGCTGCGCTG RKKKQRFhhhhhhGCTCCCGCGGCGACGGCCTCGCCGGCTCCCAC GGCGAGATCCACGCCGTCATCGCTGGCCCACCACTGACCGACGACCACAACGACATCGGCGCTG 52 reticulocyte binding PVX_125738 m

FNDGSDE

S

AQKYK

DVEG

DKL CACCGCCCAAAAGTACAAGACCGACGTGGAG protein 1 precursor,NVIDETINGINSTLDELLELGNNCQLH GGCATCATCG ACAAGCTGAACGTCATCGACGA putativeRTFLISSSLNNKIAKFLVEIREQKENTK GACGATCAACGGCATCAACAGCACCCTGGACKCFQYVKRNHQHLANFVSELHKTQG GAGCTCCTGGAGCTCGGCAACAACTGCCAACTGIFENVNLVDNTPDADKYYHEFMEIE CCACAGGACCTTCCTGATCTCCAGCTCCCTCAAQEATKIVKDIKKEIYHLNDDVDEPVLE CAACAAGATCGCCAAGTTCCTCGTGGAGATCAKRIKDVINTYNKLKTKKVQMDQSYKN GGGAGCAGAAGGAGAACACCAAGAAGTGCTTMYITKLREVEGSHDLFNQVAQLIRGE CCAATACGTGAAGCGCAACCACCAGCACCTGGTDKKGKALSERENNLHSIYNFVKLHE CCAACTTCGTCTCCGAGCTCCACAAGACCCAATELHNLYAKYTPEYMEKINKIFDDINA GGCGGCATCTTCGAGAACGTCAACCTGGTGGRMIAVDLNDDHSSEYSDVKRHEHHEA ACAACACCCCAGACGCCGACAAGTACTACCACMLLMDATNNLSKEVEMMQNESGGK GAGTCATGGAGATCGAGCAAGAGGCCACCANDGINGGKSQLVEDYTNTMSEFTEQ AGATCGTCAAGGACATCAAGAAGGAGATCTAAKTVAKKIHDSKGDYANMFDHIRENE CCACCTGAACGACGACGTGGACGAGCCAGTCAMLERIDLKKKDIKEILAHLNRMKEYLL CTGGAGAAGAGGATCAAGGACGTGATCAACAKKLSEEEKLHHMREKLEEVNTSTDEI CCTACAACAAGCTGAAGACCAGAAGGTCCAVKKFRTYDQMVDISQNIDIKNVQSKR GATGGACCAGTCCTACAAGAACATGTACATCAYDSVDEIDKEMSYIKTHNKDLIDSKFIV CCAAGCTGAGGGAGGTGGAGGGCAGCCACGERALENDKRKKSEMAQIFSTISRDNS ACCTGTTCAACCAAGTCGCCCAGCTCATCAGGSMYEYAKSFFDSVLKEIEKLTQMIRN GGCGAGACGGACAAGAAGGGCAAGGCCCTGTMDKLINENEAVMEKLKDQRRELQNV CCGAGCGCGAGAACAACCTCCACAGCATCTACENASTDLGKLEEVDKMAQTKSETELS AACTTCGTGAAGCTGCACGAGACGGAGCTCCERNDSRNAKDGATYSTLMDDKETDS ACAACCTGTACGCCAAGTACACCCCAGAGTACVNGEETKQENVVVKKGLPPQTDIYTS ATGGAGAAGATCAACAAGATCTTCGACGACATVVLKNDRNDQKSEKIGEKKSNKPVGT CAACGCCAGGATGATCGCCGTGGACCTCAACEENIQHSSYLNNDNSNNDIDVGTLYT GACGACCACAGCTCCGAGTACAGCGACGTCALGGYNAPNDNYNTNESGDDINEEAK AGCGCCACGAGCACGAGGCCATGCTCCTGATKKRNAVLFVYVGGLFSALFICIGAVFY GGACGCCACCAACAACCTGTCCAAGGAAGTG 53PvDBP (region II); PVX_110810 mGEHKTDSKTDNGKGANNLVMLDYEACAACGGCAAGGGCGCCAACAACCTGGTCAT ;Duffy receptorTSSNGQPAGTLDNVLEFVTGHEGNS GCTCGACTACGAGACGTCCTCCAACGGCCAGCprecursor (DBP) RKNSSNGGNPYDIDHKKTISSAIINHACAGCTGGCACCCTGGACAACGTGCTGGAGTTC FLQNTVMKNCNYKRKRRERDWDCNGTCACCGGCCACGAGGGCAACAGCAGGAAGA TKKDVCIPDRRYQLCMKELTNLVNNTACTCCAGCAACGGCGGCAACCCATACGACATC 
DTNFHRDITFRKLYLKRKLIYDAAVEGGACCACAAGAAGACCATCTCCAGCGCCATCAT DLLLKLNNYRYNKDFCKDIRWSLGDFCAACCACGCCTTCCTGCAGAACACCGTGATGA GDIIMGTDMEGIGYSKVVENNLRSIFGAGAACTGCAACTACAAGAGGAAGAGGCGCGA TDEKAQQRRKQWWNESKAQIWTAMGCGCGACTGGGACTGCAACACCAAGAAGGAC MYSVKKRLKGNFIWICKLNVAVNIEPGTCTGCATCCCAGACAGGCGCTACCAACTCTG QIYRWIREWGRDYVSELPTEVQKLKECATGAAGGAGCTGACCAACCTCGTGAACAACA KCDGKINYTDKKVCKVPPCQNACKSCCGACACCAACTTCCACAGGGACATCACCTTC YDQWITRKKNQWDVLSNKFISVKNACGCAAGCTGTACCTCAAGAGGAAGCTGATCTA EKVQTAGIVTPYDILKQELDEFNEVAFCGACGCTGCTGTGGAGGGCGACCTCCTGCTCA ENEINKRDGAYIELCVCSVEEAKKNTAGCTCAACAACTACAGGTACAACAAGGACTTC QEVVhhhhhhTGCAAGGACATCCGCTGGTCCCTGGGCGACTT CGGCGACATCATCATGGGCACCGACATGGAGGGCATCGGCTACTCCAAGGTGGTCGAGAACA ACCTCCGCAGCATCTTCGGCACCGACGAGAAGGCCCAACAGAGGCGCAAGCAATGGTGGAACG AGTCCAAGGCCCAGATCTGGACCGCCATGATGTACAGCGTGAAGAAGAGGCTGAAGGGCAACT TCATCTGGATCTGCAAGCTCAACGTGGCCGTCAACATCGAGCCACAGATCTACAGGTGGATCAG GGAGTGGGGCAGGGACTACGTCTCCGAGCTGCCAACCGAGGTGCAAAAGCTCAAGGAGAAGT GCGACGGCAAGATCAACTACACCGACAAGAAGGTGTGCAAGGTCCCACCATGCCAAAACGCCT 54 MSP3.10[merozoite PVX_097720 mV

GGSPNNEAPNSS

L

NGFPG CCCCAAACTCCAGCAGGCACCACCTCCGCAAC surface protein 3 alphaKNDSLPHEEPNNLEGKNESSDQCDTI GGCTTCCCAGGCAAGAACGACTCCCTCCCACA MSP3a)]NLGQVTEKEKKTIEQASVQAQDATKP CGAGGAGCCAAACAACCTGGAGGGCAAGAACEANNAEQIQAELQKVKTAKDESATAA GAGTCCAGCGACCAATGCGACACCATCAACCTKDAETAKKNAVDAGKGLDAAKGAIKK GGGCCAGGTGACCGAGAAGGAGAAGAAGACAEEAAAEAKKQAGIAEKAEKDAEAAG CATCGAGCAAGCTAGCGTCCAAGCTCAGGACKKDKLEDVNSQVQIAVEASTKAKDKK GCTACCAAGCCAGAGGCCAACAACGCCGAGCTEAEIAVEIVKAVVAKEEAQKASDEAQ AAATCCAGGCCGAGCTCCAAAAGGTGAAGACKACEKAQKAHAKAQKASDTTKTVETF CGCTAAGGACGAGTCCGCTACCGCTGCTAAGKTNAEAAAKNAKEKAGNANKAATEA GACGCTGAGACGGCCAAGAAGAACGCTGTGGESANELSVAKQKAKDAEEAAKEAKKE ACGCTGGCAAGGGCCTGGACGCCGCCAAGGGQVKAEIAAEVAKAKVAKEEADAAQKK CGCCATCAAGAAGGCTGAGGAAGCCGCCGCCAEAAKKIVDKIAQDTKVPEAQREAKLA GAGGCCAAGAAGCAGGCTGGCATCGCCGAGATQTASKATEAATEAGKKAQEAEESS AGGCTGAGAAGGACGCTGAGGCTGCTGGCAAKEAEEKAETSDAVKGKADAAEKAAG GAAGGACAAGCTGGAGGACGTGAACAGCCAAEAKKASIETEIAIEVAKAEVLNAEVKKT GTCCAGATCGCCGTGGAGGCCTCCACCAAGGAQEAEKDATEAKEQAEKAKAAAEEA CCAAGGACAAGAAGACCGAGGCCGAGATCGCKTHGEKAEKVGESTKAHSDEAQQEN CGTGGAGATCGTCAAGGCCGTGGTCGCCAAGKNAKDASEEAENRAVDALEEAYAVE GAAGAGGCCCAAAAGGCTAGCGACGAGGCTCAHLARTKNAAESAKSATDMSELEKAK AGAAGGCTTGCGAGAAGGCCCAAAAGGCTCAEEAIDAANIAHQKWLKATQAATIAKEK CGCTAAGGCTCAGAAGGCTTCCGACACCACCAKEAAKVAAEKAQTAANVVKDKAAKA AGACCGTGGAGACGTTCAAGACCAACGCCGAEAKKAETEAVKAAVEARAAAEEAKQE GGCTGCCGCCAAGAACGCCAAGGAGAAGGCTAAKVGASKEPQETKNKANVEAEATG GGCAACGCTAACAAGGCTGCTACCGAGGCTGNEAKKAEDAAEEAKEAAKKANEATD AGAGCGCTAACGAGCTCTCCGTGGCCAAGCAYGLLKTKNQYVLEPLDISPESADNITS GAAGGCCAAGGACGCCGAGGAAGCCGCCAAKEEQVKEEMEDQGDEDSNEAEVEEA GGAAGCCAAGAAGGAGCAAGTCAAGGCTGAGATCGCTGCTGAGGTGGCTAAGGCTAAGGTG 55 sexual stage antigen PVS_000930mENNKIKGGKVPPPSVPTGNNSDNN ATGGAGAACAACAAGATCAAGGGCGGCAAGG s16, putativeVPKKDGGENNPPPDAENALQELKNF TGCCACCACCATCCGTCCCAACCGGCAACAACTKNLEKKTTTNRNIIISTTVNMVLLVLL TCCGACAACAACGTGCCAAAGAAGGACGGCGSGLIGYNTKKGFKKGQMGSVKEVTP GCGAGAACAACCCACCACCAACGCCGAGAA EAQKGKLhhhhhhCGCCCTCCAAGAGCTGAAGAACTTCACCAAGA ACCTGGAGAAGAAGACCACCACCAACAGGAACATCATCATCTCCACCACCGTCATCAACATGGT 
GCTCCTGGTCCTCCTGAGCGGCCTGATCGGCTACAACACCAAGAAGGGCTTCAAGAAGGGCCA AATGGGCTCCGTGAAGGAAGTGACCCCAGAGGCCCAGAAGGGCAAGCTCCACCACCACCACCA CCACTGA 56 Positive Control? 57 Negative Control?

indicates data missing or illegible when filed

Appendix II

TABLE 5: List of protein references for additional 25 proteins

Protein Code    Protein Name       Protein Reference    Source
X1              PVX_094350         PVX_094350           Ehime University
X2              PVX_099930         PVX_099930           Ehime University
X3              PVX_114330         PVX_114330           Ehime University
X4              PVX_088820         PVX_088820           Ehime University
X5              PVX_080665         PVX_080665           Ehime University
X6              PVX_092995         PVX_092995           Ehime University
X7              PVX_087885         PVX_087885           Ehime University
X8              PVX_003795         PVX_003795           Ehime University
X9              PVX_087110         PVX_087110           Ehime University
X10             PVX_087670         PVX_087670           Ehime University
X11             PVX_081330         PVX_081330           Ehime University
X12             PVX_122805         PVX_122805           Ehime University
V1              RBP1b (P7)         PVX_098582           WEHI
V2              RBP2a (P9)         PVX_121920           WEHI
V3              RBP2b (P25)        PVX_094255           WEHI
V4              RBP2cNB (M5)       PVX_090325           WEHI
V12             RBP1a (P5)         PVX_098585           WEHI
V5              RBP2-P2 (P55)      PVX_101590           WEHI
V11             PvEBP              KMZ83376.1           IPP
V10             Pv DBPII (AH)      AAY34130.1           IPP
V13             Pv DBPII (SaII)    PVX_110810           IPP
V6              PvDBP R3-5         PVX_110810           WEHI
V7              PvGAMA             PVX_088910           WEHI
V8              PvRipr             PVX_095055           WEHI
V9              PvCYRPA            PVX_090240           WEHI

List of Protein Sequences (Insert aa Sequence)

X1: ENPVRHSVDIKSEDFVVLISLQNLQTFIMIGYTAVNKDHLNFDFSYLWALCIGTGLFIYSLISFVLIRSLALSKIDIGKYVLELLFSLSIIATCSLSIIIDSFKIANMQLLFFSFALTGYAYYNLMSLFFFCTLVGMTIQYNLSFTGFRAHSTSFFFLDMLSYLVQMIGGNILYFRMYELCTLIVISKRNPCKYVVASKEVKQVEKQIFSSLENSYMCIKSKTYSDLTCTNDLLNKDSQSVVGRDTNPKWNSPIGTSYQDKVNHTKKLLLRRGKRDKRYPKGGGGARLTCAKHSAYHNSRSLANCASKNTPICTTNFRISNTLSLKNHFNPNLTLEASPPVCKKCVSEKNSHKDNEYKNGEERKKAKRGIKSGTANKSNQLGNHGGDATQVANPTYRTTSHGGDATQVAYPTYRTTSHGGDATQVDSPTHPTTSHGGNNSSSGHPQDDEVLIPIRGTNATNDAAATYNSNASWIKTAAVIDVSVEGKQKKGGHQTFAGNPVNSSANFPSDKKPSYNSHRNGGTPPPNEQLRYYACPCYQTHSSGSSLSEVPSGQTTKRKNSAHNSVEGGNPKMDNQQSRRVSNKRVDGATGEEHDHPSDPPADNPNGNSNTYHC
X2: ELSHSLSVKNAPDASALNIEVEKDKKKICKNAFQYINVAELLSPREEETYVQKCEEVLDTIKNDSPDESAEAEINEFILSLLHARSKYTIINDSDEEVLSKLLRSINGSISEEAALKRAKQLITFNRFIKDKAKVKNVQEMLVISSKADDFMNEPKQKMLQKIIDSFELYNDYLVILGSNINIAKRYSSETFLSIKNEKFCSDHIHLCQKFYEQSIIYYRLKVIFDNLVTYVDQNSKHFKKEKLLELLNMDYRVNRESKVHENYVLEDETVIPTMTITDIYDQDRLIVEVVQDGNSKLMHGRDIEKREISERYIVTVKNLRKDLNDEGLYADLMKTVKNYVLSITQIDNDISNLVRELDHEDVEK
X3: LPWTKKRKAVNQMGIIKDMSQELRTKAEQLPTPEDISAKIHRVDKEVIDKLNKDIIEEENLDKHKPHVCQEPAYERDYSYLCPEDWVKNSNDQCWGIDYDGHCEALKYFQDYSVEEKKEFEMNCCVLWPKLKNEGMKGAHKKDLLRGSISSNNGLIIKPKYL
X4: ELKKNNAALTSQRSSSRTTSTRSYKNAPKNSTSFLSRLSILIFALSCAIFVNTASGAAANRPNANGFVSPTLIGFGELSIQESEEFKRMAWNNWMLRLESDWKHFNDSVEEAKTKWLHERDSAWSDWLRLQSKWSHYSEKMLKEHKSNVMEKSANWNDTQWGNWIKTEGRKILEAQWEKWIKKGDDQLQKLILDKWVQWKNDKIRSWLSSEWKTEEDYYWANVERATTAKWLQEAEKMHWLKWKERINRESEQWVNWVQMKESVYINVEWKKWPKWKNDKKILFNKWSTNLVYKWTLKKQWNVWIKEANTAPQV
X5: KGVTLSCVFSHASEEREGGTGTFALSNEPIYYAPSGGLAPCALISRGLSGDEEGSGEDGGEDGDGDGGEDSAEDNAEDGDDDGGEDGGLPGGRFPYEEGKKSSLVSDAPSDLLDGDADEHAAEDGGAKRKMSKKEEEAEDNKIDKLVNAEMKKLEAGEEANKDPDAEPEKEDQGSGQGQRAKLRCSNKLNYIQVTANGQREGDLFGENDGESAPAFVEIPHEVEEESGGVPTKHDEAGEAAAAEEPHNRVDRAEKENNAKDLKFVEGERERQRSSPPSNGYSQNSFVELKGVPDKLPPNFTNSLGSSPTHSNLEKPVYKHLPWSILASDSGSNTGSWADVNSSTYNVSPFSFTSIRSGNSLHLLPMNFQIQNSIVKVTDEEYDKLKLKNSVKVYDKNALVDYKYEIFEVKEGEEYNDGNDPYEERNGEEGDAGGEGGSDGEGDADSKSYQNNKSDGRGFFDGTLVTYTIIILAGVIILLLSFVIYYYDIINKVKRRMSAKRKNNKSMAIANDTSAGMYMGDTYMENPHV
X6: SQGCSGYRLPPPKRWFTFTSRPYCKTAAYYELKHMPYYVDAVSASENVKHEKWNNWLKEMKISLTEKLEKESQEYMEKLEQQWDEFMKNSEDKWRHYNPQMEEEYQCSVYPLGLKWDDEKWTAWFYEKGLWCLKKSFKTWLTDSKKGYNTYMKNLLQEFGKQFYEDWCRRPEKRREDKICKRWGQKGLRNDNYYSLKWMQWRNWKNRNHDQKHVWVTLMKDALKEYTGPEFKLWTEFRKEKIDFYKQWMQAFAEQWTQDKQWNTWTEERNEYMKKKKEEEAKKKAASKKKAASKKGGAAKKAPAKKAPTKKAAPGTKAPAKKAAPKKVAAPNAA
X7: KEAVKKGSKKAMKQPMHKPNLLEEEDFEEKESFSDDEMNGFMEESMDASKLDAKKAKTTLRSSEKKKTPTSGMSGMSGSGATSAATEAATNMNATAMNAAAKGNSEASKKQTDLSNEDLFNDELTEEVIADSYEEGGNVGSEEAESLTNAFDDKLLDQGVNENTLLNDNMIYNVNMVPHKKRELYISPHKHTSAASSKNGKHHAADADALDKKLRAHELLELENGEGSNSVIVETEEVDVDLNGGKSSGSVSFLSSVVFLLIGLLCFTN
X8: NLSNDCKKGANNSFKLIVHTSDDILTLKWKVTGEGAAPGNKADVKKYKLPTLERPFTSVQVHSANAKSKIIESKFYDIGSGMPAQCSAIATNCFLSGSLEIEHCYHCTLLEKKLAQDSECFKYVSSEAKELIEKDTPIKAQEEDANSADHKLIESIDVILKAVYKSDKDEEKKELITPEEVDENLKKELANYCTLLKEVDTSGTLNNHQMANEEETFRNLTRLLRMHSEENVVTLQDKLRNAAICIKHIDKWILNKRGLTLPEEGYPSEGYPPEEYPPEELLKEIEKEKSALNDEAFAKDTNGVIHLDKPPNEMKFKSPYFKKSKYCNNEYCDRWKDKTSCMSNIEVEEQGDCGLCWIFASKLHLETIRCMRGYGHFRSSALFVANCSKRKPEDRCNVGSNPTEFLQIVKDTGFLPLESDLPYSYSDAGNSCPNKRNKWTNLWGDTKLLYHKRPNQFAQLGYVSYESSRFEHSIDLFIDILKREIQNKGSVIIYIKTNNVIDYDFNGRVVHSLCGHKDADHAANLIGYGNYISAGGEKRSYWIVRNSWGYYWGDEGNFKVDMYGPEGCKRNFIHTAVVFKIDLGIVEVPKKDEGSIYSYFVQYVPNFLHSLFYVSYGKGADKGAAVVTGQAGGAVVTGQTETPTPEAAKNGDQPGAQGSEAEVAEGGQAGNEAPGGLQESAVSSQTSEVTPQSSITAPQIGAVAPQIGAAAPQIDVAAPQIDVVAPQTRSVDAPQTSSVAAHPPNVTPQNVTLGEGQHAGGVGSLIPADN
X9: ETLLDSETLKNYEKETNEYIRKKKVEKLFDVILKNVLVNKPENVYLYIYKNIYSFLLNKIFVIGPPLLKITPTLCSAIASCFSYYHLSASHMIESYTTGEVDDAAESSTSKKLVSDDLICSIVKSNINQLNAKQKRGYVVEGFPGTNLQADSCLRHLPSYVFVLYADEEYIYDKYEQENNVKIRSDMNSQTFDENTQLFEVAEFNTNPLKDEVKVYLRN
X10: YPKKNFDKPDPTSPYQGQYGESEEQRQGYGIPPNPTMINLTGNQDQRPNVLQQFGINNKNVMQFLINMFVYVAAILVSLKIWDYMSYSKCDYYKDLLLRIVRYQSHMNDGKMA
X11: SRIDKQPIQSSYLFQDNAVPPVRFSAVDADLFSIGVVHTEEQIFMDDANWVISSVPSKYLNLHLLKTGSRPHFSHFSVSMNTGCNLFIASPVGETFPLSPSKDGATWKAFETDDSVEVIHRETKEKRIYKLKFIPLKSGALLKVDVLKGIPFWVISQGRKILPTICSGDEEVLSNPQNEVFKECTSSSSLSPEFDCLAGLSTYHRDKKNHTWKTSSGSIGQFIKIFFNKPVQITKFRFKPRDDLLSWPSEVALQFDTDEEVIIPILHTHNMGQNTTRLEHPIITTSVKVEVRDMYERASENTGGSFEVIGSTCQMMEDDYMTHHAVIDITECDRRLESLPDVMPLTKGSKFLAICPRPCLSSSNGGVIYGSDVYSTDSAVCGAAVHAGVCSREGEGSCHFLVVVRGGRANFVGALQNNVLSLSRGGGGSGSGSSTSSDGDGDSDSSTSRANFSFSLSSASGFGGGPRGAHAEAAPSSYSIVFKPRDHLAPTNGFLVDSGREFTSYGSVAYGWKREVSPSSSFSSPSPSYTSPPLEEPTLLRGDSSSFNGIYSGGIEFPPASASQNCISQLDCQTNFWKFQMQENGTYFVQVLVGNKTSPEKQKAFVELNGVPIIKGVDLGPDEVFNATDRVQVTNRALVLTSTCLGGESACSRARVSIMAVQIVKT
X12: NGMNKDKDAEITPPPFIVLPGGKKIHMLQSEYEYDVLRDMYRTDEANGGSGEKESHPSGDGAIRRNEFFKLFHHREGHYKFVIKNVPTKLSDLLQKGGNEQETDLFPLLYRSLQFACSADGTWPYARREVAFFKNGSVHCEAEFQNELSVRRTPRSGKKSFGRFPRGTLIKSSDLRSKIVEGNSYDKRAAPLKSEKKKKALFLHPESVLYKMEEIFFYENPSVKSEIVGFVLFHDVCTVTSLGHGAHPVNSPFLGSDLLEMIFGYCILHGFKKIRVKSESLNYETGIRTSFIEILLNGKTALEHLGLRLTNVAKFSKELYYVITGYTWKSDLVLSPIVRFEHDLYVHHDIEERFFLYVNKMYRNMLHDLSFSCDENYYPYKNCYDIYPSVRRSQNNLCLFELNPIYEELKELFPDSCNIGQRVRKCYEEIKKNVVCTHNGEGGEDGCKYYQFIVNTFIKPRRKTSFFIYHNMYVQEYLSKKSYPYYLLLSEVIKNEENNFLEKGNYDLVADAQTHLFLNYVLQNSTFFIFWNFSTEFWKRFRYIQAGPTGATSTPQKGQAVFCPMAYAYEFVEHLDTFYVRG
V6: SVEEAKKNTQEVVTNVDNAAKSQATNSNPISQPVDSSKAEKVPGDSTHGNVNSGQDSSTTGKAVTGDGQNGNQTPAESDVQRSDIAESVSAKNVDPQKSVSKRSDDTASVTGIAEAGKENLGASNSRPSESTVEANSPGDDTVNSASIPVVSGENPLVTPYNGLRHSKDNSDSDGPAESMANPDSNSKGETGKGQDNDMAKATKDSSNSSDGTSSATGDTTDAVDREINKGVPEDRDKTVGSKDGGGEDNSANKDAATVVGEDRIRENSAGGSTNDRSKNDTEKNGASTPDSKQSEDATALSKTESLESTESGDRTTNDTTNSLENKNGGKEKDLQKHDFKSNDTPNEEPNSDQTTDAEGHDRDSIKNDKAERRKHMNKDTFTKNTNSHHLN
V7: IRNGNNPQALVPEKGADPSGGQNNRSGENQDTCEIQKMAEEMMEKMMKEKDVFSSIMEPLQSKLTDDHLCSKMKYTNICLHEKDKTPLTFPCTSPQYEQLIHRFTYKKLCNSKVAFSNVLLKSFIDKKNEENTFNTIIQNYKVLSTCIDDDLKDIYNASIELFSDIRTSVTEITEKLWSKNMIEVLKTREQTIAGILCELRNGNNSPLVSNSFSYENFGILKVNYEGLLNQAYAAFSDYYSYFPAFAISMLEKGGLVDRLVAIHESLTNYRTRNILKKINEKSKNEVLNNEEIMHSLSSYKHHAGGTRGAFLQSRDVREVTQGDVSVDEKGDRATTAGGNQSASVAAAAPKDAGPTVAAPNTAATLKTAASPNAAATNTAAPPNMGATSPLSNPLGTSSLQPKDVAVLVRDLLKNTNIIKFENNEPTSQMDDEEIKKLIESSFFDLSDNTMLMRLLIKPQAAILLIIESFIMMTPSPTRDAKTYCKKALVNGQLIETSDLNAATEEDDLINEFSSRYNLFYERLKLEEL
V8: KEYCDQLSFCDVGLTHHFDTYCKNDQYLFVHYTCEDLCKTCGPNSSCYGNKYKHKCLCNSPFESKKNHSICEARGSCDAQVCGKNQICKMVDAKATCTCADKYQNVNGVCLPEDKCDLLCPSNKSCLLENGKKICKCINGLTLQNGECVCSDSSQIEEGHLCVPKNKCKRKEYQQLCTNEKEHCVYDEQTDIVRCDCVDHFKRNERGICIPVDYCKNVTCKENEICKVVNNTPTCECKENLKRNSNNECVFNNMCLVNKGNCPIDSECIYHEKKRHQCLCHKKGLVAINGKCVMQDMCRSDQNKCSENSICVNQVNKEPLCICLFNYVKSRSGDSPEGGQTCVVDNPCLAHNGGCSPNEVCTFKNGKVSCACGENYRPRGKDSPTGQAVKRGEATKRGDAGQPGQAHSANENACLPKTSEADQTFTFQYNDDAAIILGSCGIIQFVQKSDQVIWKINSNNHFYIFNYDYPSEGQLSAQVVNKQESSILYLKKTHAGKVFYADFELGHQGCSYGNMFLYAHREEA
V9: SKNIIILNDEITTIKSPIHCITDIYFLFRNELYKTCIQHVIKGRTEIHVLVQKKINSAWETQTTLFKDHNWFELPSVFNFIHNDEIIIVICRYKQRSKREGTICKRWNSVTGTIYQKEDVQIDKEAFANKNLESYQSVPLTVKNKKFLLICGILSYEYKTANKDNFISCVASEDKGRTWGTKILINYEELQKGVPYFYLRPIIFGDEFGFYFYSRISTNNTARGGNYMTCTLDVTNEGKKEYKFKCKHVSLIKPDKSLQNVAKLNGYYITSYVKKDNFNECYLYYTEQNAIVVKPKVQNDDLNGCYGGSFVKLDESKALFIYSTGYGVQNIHTLYYTRYD

List of Polynucleotide Sequences (Insert bp Sequence)

X1 GAGAACCCCCGTGAGGCACTCGGTGGACATAAAGTCGGAAGACTTCGTCGTCCTGATTTCGCTCCAAAACCTGCAGACCTTCATCATGATAGGGTACACAGCCGTGAACAAAGACCACCTGAATTTCGACTTCTCCTACTTATGGGCCCTCTGCATCGGGACGGGCCTCTTCATATACTCCCTCATCAGCTTTGTACTCATAAGATCCCTAGCACTGTCAAAAATAGACATAGGCAAATACGTCCTGGAGCTGCTATTCAGTTTGAGTATAATCGCCACATGTTCACTCTCCATAATAATTGACTCTTTCAAAATAGCCAACATGCAGTTGCTTTTTTTTTCGTTCGCTTTAACGGGCTATGCCTACTACAATTTGATGAGCCTCTTCTTTTTCTGCACACTGGTAGGAATGACCATTCAGTACAATTTAAGTTTCACTGGGTTCAGAGCGCATTCGACTTCTTTCTTCTTTTTAGATATGCTATCTTACCTAGTGCAAATGATAGGAGGGAACATCCTCTACTTTCGCATGTACGAGCTGTGTACCCTAATCGTCATTTCGAAGAGGAACCCCTGCAAGTATGTTGTCGCATCGAAGGAAGTGAAACAAGTGGAGAAGCAAATTTTCTCTTCTTTATTTAATTCTTACATGTGCATCAAGTCCAAAACTTATTCAGATTTAACCTGCACTAATGATCTGTTAAATAAAGACAGTCAATCTGTTGTCGGTAGGGATACGAACCCTAAGTGGAACTCCCCCATTGGTACTTcCTACCAGGATAAGGTCAATCATACGAAGAAGTTACTCCTTCGGAGGGGAAAACGGGACAAACGCTACCCCAAAGGGGGAGGGGGAGCTCGACTAACATGTGCAAAACATAGTGCCTACCATAATAGCCGAAGTCTTGCCAACTGTGCCAGTAAGAATACCCCCATTTGCACAACTAACTTTAGGATATCTAACACCCTTTCACTTAAAAATCATTTCAACCCTAACCTAACCTTAGAAGCGTCTCCCCCCGTTTGTAAAAAATGCGTTTCGGAAAAGAATAGCCATAAGGATAATGAGTACAAAAACGGGGAAGAGAGAAAAAAAGCAAAACGTGGTATCAAGTCGGGCACTGCAAACAAGTCTAACCAGTTGGGCAACCACGGGGGGGACGCTACGCAGGTGGCTAATCCTACCTACAGAACTACTTCCCACGGGGGGGACGCAACCCAGGTGGCTTATCCTACCTACAGAACTACTTCCCACGGGGGGGACGCAACGCAGGTGGATAGTCCTACCCACCCAACTACCTCCCATGGGGGGAACAACTCGTCGAGCGGGCACCCCCAAGACGACGAAGTGCTCATCCCCATTAGGGGAACCAACGCCACTAACGATGCAGCCGCCACCTACAACTCGAACGCTAGTTGGATCAAAACCGCTGCGGTTATTGACGTGTCTGTGGAGGGGAAGCAGAAAAAGGGGGGACATCAAACGTTCGCGGGCAATCCCGTAAATTCATCCGCTAATTTCCCATCGGACAAGAAACCTTCCTACAACTCGCACCGCAACGGAGGTACTCCCCCCCCAAATGAACAACTCAGGTACTACGCCTGCCCCTGCTACCAGACCCACTCCAGCGGATCGTCCCTCAGTGAGGTGCCCTCGGGACAAACGACGAAGCGGAAAAATAGTGCGCACAACTCGGTTGAAGGGGGAAACCCCAAAATGGATAATCAGCAAAGTCGCCGCGTGAGTAACAAGCGGGTAGATGGCGCAACGGGTGAGGAACATGACCACCCAAGTGACCCCCCCGCAGATAACCCAAATGGAAACTCCAACACCTACCACTGC 
X2GAGCTGAGCCACAGCTTGTCCGTGAAGAACGCGCCGGACGCGAGCGCGCTGAACATCGAGGTGGAGAAGGACAAAAAGAAGATCTGCAAAAACGCATTCCAATACATAAACGTAGCTGAGCTGTTTGTCCCCAAGGGAGGAAGAAACCTACGTGCAGAAATGTGAAGAGGTCCTAGACACAATAAAGAATGACAGTCCAGATGAATCGGCAGAAGCAGATAAACGAATTTATACTGAGCTTACTGCACGCTCGTTCTAAGTATACCATAATAAATGACTCAGATGAGGAGGTACTGAGCAAGCTCCTGAGGAGTATCAACGGATCGATAAGTGAAGAGGCAGCGTTGAAGAGAGCCAAACAGCTAATCACATTCAATCGGTTTATAAAAGACAAAGCGAAGGTAAAAAATGTGCAAGAGATGCTAGTAATAAGTAGCAAAGCAGATCACTTCATGAATGAGCCGAAGCAAAAAATGCTCCAAAAAATTATAGATTCGTTTGAACTGTATAATGATTACCTAGTCATTTTAGGGTCAAATATTAACATCGCCAAGAGGTACTCCTCAGAAACGTTTCTTTCTATTAAAAATGAAAAGTTCTGCTCAGACCACATCCACTTATGCCAGAAGTTCTACGAGCAGTCTATCATTTACTACAGATTGAAGGTTATTTTTGATAACCTGGTGACTTATGTAGATCAAAATTCCAAGCATTTTAAAAAGGAAAAGTTGCTGGAGCTTCTAAATATGGATTATAGGGTCAATCGAGAGTCGAAGGTGCATGAAAATTACGTGCTGGAGGATGAGACGGTCATCCCCACGATGCGCATTACAGACATTTACGATCAAGATAGGCTAATTGTTGAGGTCGTTCAGGATGGAAATAGCAAGCTGATGCACGGCAGGGATATTGAGAAGAGGGAAATCAGCGAGAGGTACATCGTCACCGTGAAGAACCTGCGCAAGGACCTCAACGACGAGGGGCTCTACGCCGACTTGATGAAGACCGTCAAGAACTACGTGCTCTCCATCACGCAGATCGACAACGACATTTCCAACCTCGTGCGCGAGCTCGACCACGAGGATGTGGAGAAG X3CTACCATGGACGAAGAAAAGAAAGGCGGTGAACCAAATGGGCATCATAAAAGATATGTCGCAGGAGCTTAGGACTAAGGCCGAACAGCTTCCAACCCCCGAGGATATATCAGCCAAAATTCACAGAGTAGATAAAGAGGTCATCGATAAGTTAAACAAAGACATCATAGAGGAAGAAAATTTAGACAAGCACAAACCGCACGTCTGCCAGGAGCCAGCATACGAGAGGGACTATTCGTACCTATGTCCCGAAGACTGGGTGAAGAACTCCAACGATCAGTGCTGGGGCATAGACTACGATGGTCACTGTGAAGCGCTAAAATATTTTCCAAGATTATTCTGTAGAGGAGAAAAAAGAATTTGAAATGAACTGCTGCGTCTTGTGGCCTAAGCTAAAAAATGAAGGCATGAAAGGAGCGCACAAGAAGGACCTCCTAAGGGGATCGATAAGTTCAAACAATGGGTTAATAATAAAGCCGAAATATTTG 
X4GAATTGAAGAAGAACAATGCCGCGTTGACCTCACAAAGGTCATCTTCTAGAACCACATCCACAAGGAGCTACAAAAATGCCCCAAAAAATTCCACTTCATTCCTTTCTCGTTTATCTATTCTGATATTTGCCTTATCATGTGCTATTTTTGTAAATACTGCATCAGGGGCGGCAGCTAATAGACCAAACGCGAATGGCTTCTGTGTCACCTACTTTAATAGGATTTGGCGAATTAAGCATCCAAGAATCAGAAGAATTCAAAAGAATGGCTTGGAATAATTGGATGTTGCGATTGGAGTCCGACTGGAAACATTTTAACGATTCTGTTGAAGAAGCCAAAACCAAATGGCTTCATGAAAGAGACTCAGCTTGGTCTGATTGGCTTCGTTCCTTGCAAAGTAAATGGTCTCACTATAGTGAAAAAATGCTTAAAGAACACAAAAGTAATGTTATGGAAAAATCAGCCAACTGGAATGACACGCAATGGGGAAATTGGATAAAAACTGAAGGAAGAAAAATTCTAGAAGCGCAATGGGAAAAATGGATTAAAAAAGGTGATGACCAATTACAAAAGTTAATTTTAGATAAATGGGTTCAATGGAAAAATGATAAGATCCGATCCTGGTTATCCAGTGAATGGAAAACCGAAGAAGATTACTACTGGGCAAATGTAGAGCGCGCTACAACAGCAAAATGGTTGCAAGAAGCAGAGAAAATGCATTGGCTTAAATGGAAAGAAAGAATTAACAGAGAGTCTGAACAATGGGTGAACTGGGTCCAAATGAAAGAAAGCGTTTACATCAATGTAGAATGGAAAAAATGGCCCAAATGGAAAAATGATAAAAAAATTCTATTTAACAAATGGTCAACTAACCTTGTCTACAAATGGACACTGAAAAAGCAGTGGAACGTTTGGATTAAGGAAGCAAATACTGCACCCCAAGTT X5AAGGGTGTCACCTTGAGTTGCGTTTTTTCCCATGCGAGTGAGGAACGTGAGGGTGGCACAGGGACATTTGCTTTGAGCAATGAGCCGATTTATTACGCCCCTAGTGGGGGGCTGGCGCCGTGCGCGCTCATCAGCAGAGGGTTAAGCGGGGATGAGGAGGGTAGCGGCGAGGACGGCGGTGAAGATGGCGACGGAGATGGTGGTGAAGACAGCGCTGAGGACAACGCTGAGGATGGAGACGATGATGGTGGCGAAGATGGCGGCTTGCCCGGGGGACGCTTCCCATACGAAGAAGGAAAAAAGAGTAGCCTTGTGAGCGACGCACCCAGCGACCTCCTGGATGGAGATGCGGATGAACATGCCGCCGAAGATGGGGGAGCGAAGCGAAAGATGAGTAAGAAGGAGGAAGAGGCGGAGGATAACAAAATTGACAAGTTGGTAAATGCGGAAATGAAAAAGCTCGAGGCAGGGGAAGAGGCGAACAAGGATCCCGACGCAGAACCAGAAAAAGAGGACCAGGGAAGTGGCCAAGGACAAAGGGCGAAGCTGAGGTGCTCAAACAAGCTAAATTACATACAGGTGACGGCGAATGGCCAAAGGGAGGGCGACCTCTTTGGCGAGAACGACGGGGAGAGCGCCCCAGCTTTCGTGGAGATACCCCACGAGGTTGAGGAGGAAAGCGGCGGTGTGCCCACAAAGCATGACGAAGCGGGGGAAGCAGCTGCGGCGGAGGAACCACATAACCGCGTCGACCGAGCGGAAAAAGAAAACAACGCGAAGGACTTAAAATTTGTGGAGGGGGAGCGAGAAAGACAAAGGAGCAGCCCCCCCTCGAATGGATATTCCCAAAACAGCTTTGTCGAACTGAAAGGTGTGCCCGATAAATTGCCCCCTAATTTACCAACTCGCTTGGTAGCTCCCCAACGCACAGTAATTTGGAGAAACCAGTTTATAAGCACTTACCCTGGTCTATCCTGGCATCCGACTCTGGTTCGAACACCGGGTCCTGGGCAGACGTCAACAGTAGTACCTACAATGTGAGTCCATTC
AGTTTCACCTCAATACGTAGTGGTAACTCTCTGCATCTACTGCCGATGAATTTCCAAATCCAAAACTCCATCGTGAAAGTAACTGATGAGGAGTATGACAAATTGAAGCTTAAAAACAGCGTCAAAGTGTATGACAAAAATGCCCTGGTAGATTATAAGTATGAAATTTTTGAGGTGAAGGAAGGGGAGGAATATAATGATGGGAATGACCCTTATGAGGAAAGGAATGGGGAAGAAGGGGATGCAGGTGGAGAGGGGGGTTCCGATGGGGAGGGAGATGCAGATTCTAAATCATATCAAAATAACAAATCGGATGGACGTGGGTTCTTCGATGGGACCTTAGTAACCTACACCATTATCATTTTAGCTGGTGTTATAATTCTGCTGCTAAGTTTTGTCATTTATTACTACGATATAATAAATAAGGTGAAGAGGCGAATGAGTGCCAAGCGGAAGAACAACAAATCTATGGCCATCGCGAATGATACATCCGCGGGGATGTACATGGGCGACACCTACATGGAGAATCCCCACGTT X6TCACAAGGATGTTCAGGATACCGTTTACCACCACCAAAAAGATGGTTTACCTTCACTTCTCGACCATACTGTAAAACAGCTGCATATTATGAACTTAAACATATGCCATATTATGTAGATGCAGTTAGTGCATCAGAAAACGTAAAACATGAGAAATGGAATAACTGGTTAAAAGAAATGAAAATATCATTAACTGAAAAATTAGAAAAAGAATCACAAGAATATATGGAAAAATTGGAACAGCAATGGGATGAATTMTGAAAAATTCAGAAGATAAATGGAGGCTATTATAATCCCCAAATGGAAGAAGAATATCAATGTAGTGTTTATCCACTTGGATTAAAATGGGATGATGAAAAGTGGACTGCATGGTTTTATGAAAAAGGATTATGGTGTTTGAAGAAAACTCTTTAAAACATGGCTCACTGATTCTAAAAAAGGTTACAACACCTACATGAAAAATCTTTTACAGGAATTTGGTAAACAATTTTATGAAGATTGGTGTCGTAGACCTGAAAAACGTCGTGAAGATAAAATTTGCAAGAGATGGGGACAAAAAGGATTACGTAATGACAATTACTATTCGTTAAAGTGGATGCAGTGGAGAAATTGGAAAAACAGAAACCACGATCAAAAACATGTGTGGGTAACTCTTATGAAGGATGCGCTAAAGGAATATACGGGGCCCGAATTCAAATTATGGACTGAGTTTAGAAAAGAAAAGATAGACTTTTACAAGCAATGGATGCAAGCTTTCGCCGAACAGTGGACACAAGACAAACAATGGAATACGTGGACTGAAGAAAGAAATGAATATATGAAAAAGAAAAAAGAAGAAGAAGCAAAAAAAAAAGCAGCATCAAAAAAAAAAGCAGCATCAAAAAAAGGAGGAGCAGCAAAAAAGGCACCAGCAAAAAAGGCACCAACAAAAAAAGCCGCACCAGGAACAAAGGCACCAGCAAAAAAAGCAGCACCTAAAAAAGTTGCAGCACCAAATGCA GCA 
X7AAGGAGGCAGTGAAGAAGGGGTCCAAGAAGGCAATGAAGCAGCCCATGCACAAGCCGAACCTTCTTGAAGAGGAAGACTTTGAGGAGAAAGAATCCTTTTCGGATGACGAGATGAATGGGTTCATGGAGGAGAGCATGGATGCTTCTAAGTTGGATGCGAAGAAGGCCAAGACGACCCTCAGGAGCTCGGAGAAGAAGAAGACTCCAACGAGCGGAATGAGTGGAATGAGTGGAAGCGGCGCCACCAGCGCAGCCACCGAGGCAGCCACGAACATGAACGCCACCGCCATGAACGCCGCTGCTAAGGGCAACAGCGAGGCGAGCAAAAAGCAAACCGACTTGTCCAACGAAGACCTGTTCAACGACGAGCTCACAGAAGAGGTCATTGCAGATTCGTACGAAGAGGGAGGAAACGTGGGAAGCGAGGAAGCCGAAAGCCTCACAAATGCATTTGACGACAAGCTACTAGACCAAGGAGTGAATGAAAATACTCTGCTGAACGACAACATGATTTACAACGTCAATATGGTTCCACATAAGAAGCGAGAATTATACATCTCCCCACACAAGCATACCTCTGCAGCAAGCAGTAAAAATGGCAAACATCATGCGGCGGACGCGGACGCTTTGGACAAAAAACTGAGGGCTCACGAGCTGCTCGAGCTGGAAAACGGAGAAGGCAGCAACTCAGTCATTGTCGAAACGGAAGAAGTGGATGTTGACCTAAACGGAGGAAAGTCAAGCGGCTCCGTGTCCTTCCTCAGCTCCGTAGTCTTCTTGCTCATCGGATTGTTATGTTT CACCAAT X8AACCTGAGCAACGATTGCAAAAAAGGAGCCAACAACAGCTTTAAGTTAATCGTGCACACCAGCGATGATATTTTGACACTCAAGTGGAAGGTCACTGGGGAAGGGGCAGCTCCAGGCAACAAAGCAGATGTAAAGAAGTACAAACTCCCTACCCTAGAGAGGCCTTTCACTTCCGTGCAAGTGCATTCAGCCAACGCCAAGTCGAAGATAATCGAAAGCAAATTTTACGACATTGGCAGCGGCATGCCAGCCCAGTGCAGCGCGATCGCCACGAACTGCTTCCTCAGCGGCAGCCTCGAAATCGAGCACTGCTACCACTGCACCCTGTTGGAGAAGAAGCTGGCCCAAGACAGCGAGTGCTTCAAGTACGTCTCGAGTGAAGCGAAGGAGTTGATCGAGAAAGACACGCCGATTAAAGCTCAAGAAGAAGACGCCAACTCTGCAGACCACAAACTGATCGAGTCCATAGACGTGATACTAAAGGCAGTGTACAAATCAGATAAAGATGAGGAAAAGAAGGAGCTCATCACCCCGGAGGAAGTGGACGAAAATTTGAAGAAAGAGCTAGCCAATTATTGTACCCTACTGAAGGAGGTAGACACAAGTGGCACTCTTAACAACCACCAGATGGCAAACGAAGAGGAAACGTTCAGAAATTTGACTCGACTGTTGCGAATGCATAGCGAAGAAAACGTGGTGACCCTTCAGGACAAACTGAGAAACGCAGCCATATGCATCAAGCACATCGACAAGTGGATTCTTAACAAGAGGGGGTTGACCCTACCGGAAGAAGGGTACCCATCGGAAGGGTACCCCCCAGAAGAGTACCCCCCGGAGGAACTCCTCAAAGAAATCGAGAAGGAAAAAAGCGCTCTGAATGATGAAGCGTTCGCTAAAGATACCAACGGAGTCATCCACCTGGATAAGCCTCCCAACGAAATGAAATTTAAATCCCCCTATTTTAAAAAGAGCAAATACTGTAACAATGAGTACTGTGATAGGTGGAAAGATAAAACGAGTTGCATGTCAAATATAGAAGTGGAAGAGCAAGGGGATTGCGGGCTCTGTTGGATTTTCGCCTCTAAGTTACACTTAGAAACGATCAGGTGCATGAGAGGGTATGGCCACTTCCGCAGCTCCGCTCTGTTTGTGGCCAACTGCTCGAAGAGGAAGCCAGAAGATAGATG
CAACGTGGGTTCTAACCCTACAGAGTTTCTTCAAATTGTTAAGGACACGGGATTTTTACCTCTAGAGTCCGATCTCCCCTACAGCTATAGCGACGCGGGGAACTCCTGCCCCAATAAAAGAAACAAGTGGACCAACCTGTGGGGGGATACCAAACTGCTGTATCATAAGAGACCCAATCAGTTTGCACAAACACTCGGGTACGTTTCCTACGAAAGCAGTCGCTTTGAGCACAGCATCGACCTCTTCATAGACATCCTCAAAAGGGAAATTCAAAACAAAGGCTCCGTTATCATTTACATAAAAACCAACAATGTCATCGATTATGACTTTAATGGAAGAGTCGTCCACAGCCTATGTGGCCATAAGGATGCAGATCATGCCGCTAACCTGATCGGTTATGGTAACTACATCAGTGCTGGTGGGGAGAAGAGGTCCTATTGGATTGTGCGAAACAGCTGGGGGTACTACTGGGGAGATGAAGGCAACTTTAAGGTTGACATGTACGGCCCGGAGGGATGCAAACGGAACTTCATCCACACGGCTGTTGTGTTTAAGATAGACCTGGGCATCGTCGAAGTCCCGAAGAAGGACGAGGGGTCCATTTATAGCTACTTCGTTCAGTACGTCCCCAACTTTTTGCACAGCCTTTTCTACGTGAGTTACGGTAAGGGTGCTGATAAGGGAGCGGCGGTGGTGACAGGGCAGGCGGGAGGAGCGGTAGTCACAGGACAGACTGAAACGCCCACTCCGGAGGCCGCTAAAAATGGGGATCAGCCAGGAGCACAGGGTAGCGAGGCAGAAGTCGCGGAGGGTGGCCAGGCAGGAAATGAAGCCCCGGGAGGGTTGCAAGAGAGTGCTGTTTCGTCGCAAACGAGTGAGGTTACGCCGCAATCTAGTATAACTGCTCCGCAAATCGGTGCAGTTGCCCCACAAATCGGTGCAGCTGCCCCACAAATCGATGTAGCCGCCCCACAAATCGATGTAGTCGCCCCACAAACGAGGTCCGTTGACGCCCCCCAAACGAGCTCGGTTGCCGCCCACCCCCCAAACGTGACGCCGCAGAACGTGACGCTTGGGGAGGGCCAGCACGCGGGGGGTGTAGGCTCCCTCAT CCCCGCGGACAAC X9GAAACCCTGCTAGACAGCGAAACGTTAAAGAACTACGAAAAGGAAACGAACGAATACATTCGCAAAAAAAAAGTGGAGAAACTGTTCGATGTTATTTTAAAAAATGTTCTGGTAAACAAACCGGAAAATGTATACCTGTACATATACAAGAACATTTATTCCTTCCTTTTGAACAAAATTTTTGTGATCGGCCCTCCTTTGCTGAAAATTACTCCCACCTTATGTTCTGCGATTGCCAGCTGCTTTAGCTACTACCACCTCAGCGCCTCGCACATGATCGAGTCTTACACTACTGGTGAAGTAGATGACGCTGCAGAGAGTTCCACAAGCAAAAAGTTAGTCAGTGACGACTTAATCTGCTCCATCGTTAAAAGCAACATAAACCAGCTGAACGCGAAGCAAAAGCGGGGGTATGTAGTCGAAGGGTTCCCCGGCACCAATCTTCAGGCAGACAGTTGCCTACGGCATTTGCCATCTTACGTTTTTGTCCTGTACGCCGACGAAGAGTACATTTATGACAAGTACGAACAAGAGAACAACGTAAAAATTCGTTCAGACATGAACAGCCAAACTTTTGATGAAAACACACAGTTGTTCGAAGTGGCCGAGTTCAACACGAATCCGCTGAAGGATGAGGTAAAGGTCTACTT AAGGAAC 
X10TATCCAAAAAAGAACTCGACAAACCCGACCCAACTTCCCCATACCAAGGACAATATGGAGAGTCTGAGGAACAAAGACAAGGTTATGGAATCCCCCCCAACCCAACCATGATTAACCTTACTGGTAACCAAGACCAACGACCAAATGTATTGCAACAATTTGGAATAAACAACAAAAATGTAATGCAGTTTTTAATAAACATGTTTGTGTACGTTGCTGCTATATTAGTTAGTTTAAAAATATGGGACTACATGTCTTACAGCAAATGTGATTATTACAAAGATTTATTATTAAGAATTGTAAGATACCAATCACACATGAATGATGGTAAGATGGCC X11AGCCGCATCGACAAGCAGCCCATCCAGAGCAGCTACCTCTTCCAGGATAACGCAGTCCCGCCTGTTCGATTCTCCGCAGTAGATGCAGACCTGTTTTCCATTGGAGTAGTTCACACAGAGGAGCAAATATTTATGGACGACGCCAACTGGGTGATTAGCAGCGTGCCCAGTAAGTACCTGAACTTGCATCTACTCAAAACGGGTTCTAGACCCCATTTTTCGCACTTCTCCGTATCTATGAACACGGGTTGCAACCTATTCCATCGCTTCCACCGGTGGGGGAAACCTTCCCCTTGAGTCCCTCCAAAGATGGAGCGACGTGGAAAGCATTTGAAACGGACGACAGTGTAGAGGTGATTCACAGAGAGACGAAGGAAAAGAGAATCTATAAGCTCAAGTTCATTCCTCTGAAGAGTGGGGCTCTCCTAAAGGTTGACGTTTTGAAGGGAATTCCCTTTTGGGTTATCTCACAAGGGAGGAAAATCCTACCAACGATTTGTTCTGGAGATGAGGAGGTGCTATCAAACCCACAGAATGAGGTCTTCAAAGAGTGCACATCGTCGAGTAGTCTCTCTCCCGAATTTGATTGTCTAGCCGGGCTGAGCACCTACCATAGGGATAAGAAGAACCACACGTGGAAAACCTTCTAGCGGATCTATAGGTCAGTTTATAAAGATCTTCTTCAATAAGCCCGTACAAATTACCAAGTTTAGGTTTAAGCCCAGAGACGACCTGCTGTCTTGGCCCTCCGAAGTAGCTCTCCAATTCGATACCGATGAGGAGGTGATCATACCAATTCTGCATACGCACAATATGGGGCAGAACACGACTAGGCTAGAACACCCAATCATCACCACCTCTGTTAAGGTAGAAGTGAGAGACATGTACGAACGGGCAAGTGAAAATACAGGAGGTTCTTTCGAGGTAATTGGAAGCACATGCCAGATGATGGAAGACGACTACATGACGCACCATGCTGTTATAGACATCACCGAGTGTGATCGTAGGTTGGAGTCCCTCCCAGATGTTATGCCCTTAACGAAGGGGAGCAAATTTCTGGCCATTTGTCCCCGCCCCTGCTTGAGCAGCTCCAATGGGGGAGTCATTTACGGGTCAGATGTTTATTCCACAGATTCTGCCGTATGTGGGGCGGCCGTACACGCGGGGGTGTGCAGCCGTGAGGGGGAGGGCAGCTGCCACTTCCTCGTTGTGGTGCGCGGCGGGCGGGCCAACTTCGTGGGGGCTCTCCAGAACAACGTCCTGTCTCTCAGTCGGGGTGGTGGCGGTAGCGGTAGCGGTAGCTCCACCAGTAGCGATGGCGATGGCGATAGCGATAGCTCCACCAGTAGGGCCAACTTCTCATTTTCCCTCTCCAGTGCGTCAGGGTTCGGGGGGGGTCCGCGCGGGGCCCACGCAGAAGCCGCGCCAAGCAGCTACTCCATTGTGTTCAAGCCGAGGGACCATTTGGCTCCAACGAACGGCTTTCTAGTAGACTCAGGGAGAGAGTTCACCAGCTACGGAAGCGTTGCCTACGGATGGAAGAGGGAGGTTTCTCCTTCGTCCTTCTTTTTCCTCTCCTTCTCCTAGCTACACTTCCCCCCCGTTGGAAGAACCGACGCTGCTTAGGGGGGACTCCTCCTCATTCAATGGGATTT
ACTCCGGGGGGATAGAATTCCCCCCCGCCTCGGCTAGCCAAAATTGCATTTCCCAACTGGATTGCCAGACCAACyrCTGGAAGTTTCAGATGCAAGAAAATGGCACCTACTTTGTGCAGGTGCTAGTGGGGAATAAAACTTCCCCTGAGAAGCAGAAGGCCTTCGTCGAGCTGAATGGCGTTCCCATCATAAAGGGGGTGGACCTTGGCCCAGACGAGGTCTTCGTCGCCACTGACCGCGTGCAGGTGACGAACCGGGCCCTCGTCCTCACGTCCACTTGCCTGGGCGGCGAGAGTGCCTGCTCGCGGGCGCGCGTCAGCATCATGGCGGTCCAGATTGTGAAGA CG X12AACGGTATGAATAAAGACAAAGACGCAGAGATTACTCCCCCTCCGTTCATCGTCTTGCCGGGTGGAAAAAAAATCCACATGCTGCAAAGCGAATACGAGTATGACGTTCTGCGGGATATGTACCGAACGGATGAGGCGAATGGGGGAAGTGGTGAGAAGGAGAGTCACCCCTCTGGGGATGGTGCAATCAGAAGAAACGAATTTTTTAAACTTTTTTCACCACAGGGAGGGTCATTATAAGTTTGTTATCAAAAATGTTCCCACCAAATTGAGCGACCTTTTGCAGAAAGGTGGCAACGAACAGGAGACAGACCTAVTTCCTCTTTTATACAGGAGTCTGCAATTCGCATGCAGCGCAGACGGGACGTGGCCATATGCCAGAAGAGAGGTGGCCTTTTTTAAAAACGGGAGCGTCCACTGCGAAGCGGAATTTCAAAACGAGTTATCAGTGAGGAGAACCCCCCGAAGTGGGAAGAAATCATTTGGACGTTTTCCAAGGGGGACACTAATAAAAAGTAGCGACCTGAGGAGCAAAATTGTGGAGGGGAATTCTTATGATAAAAGGGCCGCACCCCTGAAGAGTGAAAAAAAAAAGAAGGCTCTCTTTTTACACCCAGAAAGTGTGCTATACAAAATGGAAGAAATATTTTTTTATGAAAATCCAAGTGTCAAAAGTGAAATTGTCGCATTTTGTTCTTTTTCATGATGTTGTCTCACAGTAACGTCCTTAGGACATGGAGCACATCCCGTTAACTCCCCCTTTTTGGGAAGCGACCTGCTGGAGATGATATTTGGCTACTGCATTTTACACGGGTTTAAAAAAATCAGAGTGAAAAGCGAATCCTTAAATTACGAAACTGGGATAAGGACCTCATTCATTGAGATTTTACTCAACGGAAAAACAGCACTTGAACATTTAGGGTTAAGACTTACAAACGTAGCGAAGTTTTCTAAAGAACTGTATTATGTAATCACTGGGTATACGTGGAAAAGTGATTTGGTGCTATCACCCATAGTAAGGTTTGAACATGATTTATACGTGCATCACGACATAGAGGAGCGATTTTTCCTTTACGTGAATAAAATGTATAGGAATATGCTCCACGATTTC1TCCTTCTCTTGTGATGAAAATTATTATCCTTATAAAAATTGTTATGACATCTACCCCTCCGTGAGAAGGAGTCAAAATAATCTTTGTCTCTTCGAACTGAATCCCATATATGAAGAATTGAAGGAGCTCTTTCCAGACTCTTGTAATATTGGCCAACGCGTTAGAAAATGCTATGAGGAGATAAAAAAAAACGTTGTCTGCACACATAACGGTGAAGGAGGAGAAGACGGATGTAAGTACTACCAATTTATTGTAAATACATTCATAAAGCCGAGGAGGAAAACGTCGTTTTTTTVTTTTVTCACAATATGTATGTACAGGAATATCTTTCAAAGAAATCCTACCCCTATTACTTGCTACTCAGTGAGGTTATAAAAAATGAAGAAAATAACTTTCTCGAAAAAGGCAACTACGACTTAGTGGCCGATGCACAGACGCACCTCTTCTTAAATTACGTTTTGCAAAATTCTACCTTTTTTATCTTTTGGAATTTCTCTACCGAATELTGGAAAAGGTTTCGGTACATCCAGGCTGGC
CCAACCGGGGCCACTTCCACACCGCAGAAGGGGCAAGCTGTGTTTTGCCCCATGGCCTATGCGTACGAATTTGTGGAGCACCTCGACACGTTTTATGTGAGG GGG V6TCCGTTGAAGAGGCTAAAAAAAATACTCAGGAAGTTGTGACAAATGTGGACAATGCTGCTCTAAATCTTCAGGCCACCAATTCAAATCCGATAAGTCACTCCTGTAGATAGTAGTAAAGCGGAGAAGGTTCCAGGAGATTCTACGCATGGAAATGTTAACAGTGGCCAAGATAGTTCTACCACAGGTAAAGCTGTTACGGGGGATGGTCAAAATGGAAATCAGACACCTGCAGAAAGCGATGTACAGCGAAGTGATATTGCCGAAAGTGTAAGTGCTAAAAATGTTGATCCGCAGAAATCTGTAAGTAAAAGAAGTGACGACACTGCAAGCGTTACAGGTATTGCCGAAGCTGGAAAGGAAAACTTAGGCGCATCAAATAGTCGACCTTCTGAGTCCACCGTTGAAGCAAATAGCCCAGGTGATGATACTGTGAACAGTGCATCTATACCTGTAGTGAGTGGTGAAAACCCATTGGTAACCCCCTATAATGGTTTGAGGCATTCGAAAGACAATAGTGATAGCGATGGACCTGCGGAATCAATGGCGAATCCTGATTCAAATAGTAAAGGTGAGACGGGAAAGGGGCAAGATAATGATATGGCGAAGGCTACTAAAGATAGTAMAATAGTTCAGATGGTACCAGCTCTGCTACGGGTGATACTACTGATGCAGTTGATAGGGAAATTAATAAAGGTGTTCCTGAGGATAGGGATAAAACTGTAGGAAGTAAAGATGGAGGGGGGGAAGATAACTCTGCAAATAAGGATGCAGCGACTGTAGTTGGTGAGGATAGAATTCGTGAGAACAGCGCTGGTGGTAGCACTAATGATAGATCAAAAAATGACACGGAAAAGAACGGGGCCTCTACCCCTGACAGTAAACAAAGTGAGGATGCAACTGCGCTAAGTAAAACCGAAAGTTTAGAATCAACAGAAAGTGGAGATAGAACTACTAATGATACAACTAACAGTTTAGAAAATAAAAATGGAGGAAAAGAAAAGGATTTACAAAAGCATGATTTTAAAAGTAATGATACGCCGAATGAAGAACCAAATTCTGATCAAACTACAGATGCAGAAGGACATGACAGGGATAGCATCAAAAATGATAAAGCAGAAAGGAGAAAGCATATGAATAAAGATACTTTTACGAAAAATACAAATAGTCACCATTTAAAT 
V7ATACGGAATGGAAACAACCCGCAGGCATTAGTTCCTGAAAAGGGCGCTGACCCGAGTGGGGGCCAGAACAACCGCTCCGGAGAAAACCAAGACACGTGCGAAATTCAAAAGATGGCCGAAGAAATGATGGAAAAAATGATGAAGGAAAAAGACGTGTTTAGCTCCATCATGGAACCTCTCCAGAGCAAATTAACTGACGATCATCTGTGTTCAAAAATGAAATATACGAACATTTGTCTTCACGAAAAGGACAAAACTCCCTTGACCTTCCCCTGCACAAGTCCGCAGTACGAACAGCTAATTCATCGCTTCACTTATAAAAAGTTGTGCAACTCCAAGGTGGCCTTTAGCAACGTCTTGCTCAAATCCTTCATCGATAAAAAAAATGAAGAAAACACATTTAACACGATCATACAGAATTACAAAGTTCTGTCCACTTGCATTGACGATGATTTGAAGGACATTTATAATGCATCCATAGAGTTATTCTCCGACATAAGAACCTCCCTTCACAGAAATTACCGAAAAGTTGTGGTCCAAAAATATGATCGAAGTTTTAAAGACAAGAGAGCAAACATTGCAGGCATTTTATGTGAGTTAAGAAATGGAAATAATTCTCCCCTAGTATCGAACAGTTTTTCCTATGAAAATTTTGGAATTCTCAAGGTTAATTATGAGGGATTACTAAACCAGGCGTATGCGGCCTTTTCAGACTACTATTCATACTTTCCCGCTTTTGCCATTAGCATGTTAGAAAAGGGAGGGTTGGTCGACCGCTTGGTCGCCATCCATGAGAGCTTGACCAACTACAGGACGAGAAATATTCTCAAGAAGATCAATGAGAAGTCCAAAAATGAGGTCCTCAATAATGAAGAAATTATGCACAGCTTGAGCAGTTACAAGCACCATGCCGGGGGCACGCGTGGCGCCTTCCTGCAGTCCAGAGATGTGCGCGAAGTTACGCAAGGAGATGTGAGCGTTGATGAGAAGGGCGACCGGGCCACCACCGCGGGGGGCAACCAAAGCGCAAGCGTGGCTGCGGCGGCCCCGAAGGATGCGGGCCCAACCGTGGCTGCTCCTAACACTGCTGCTACGCTCAAAACGGCTGCTTCCCCCAACGCGGCTGCTACTAACACTGCTGCTCCCCCCAACATGGGTGCCACCTCCCCGCTGAGCAACCCCCTGTACGGCACCAGCTCCCTGCAGCCAAAGGACGTCGCGGTGCTGGTCAGAGATCTGCTCAAGAACACGAACATCATCAAGTTCGAGAATAACGAACCGACTAGCCAAATGGACGATGAAGAAATTAAGAAGCTCATTGAGAGCTCCTTTTTCGACTTGAGCGACAACACCATGTTAATGCGGTTGCTCATAAAGCCGCAGGCGGCCATCTTACTAATCATTGAGTCCTTCATTATGATGACGCCCTCCCCCACGAGGGACGCCAAGACCTATTGCAAGAAAGCCCTAGTTAATGGCCAGCTAATCGAAACCTCAGATTTAAACGCGGCGACGGAGGAAGACGACCTCATAAACGAGTTTTCCAGCAGGTACAATTTATTCTACGAGAGGCTCAAGCTGGAGGAGTTG 
V8AAGGAGTACTGCGACCAGCTTAGCTTTTGCGATGTGGaTTGACACACCACTVTGATACGTAVTGTAAGAATGACCAGTACCTGTTCGTTCACTACACTTGTGAGGACCTCTGCAAAACGTGTGGCCCTAATTCGTCCTGCTACGGAAACAAGTACAAACATAAGTGCCTGTGCAATAGCCCCTTCGAGAGTAAAAAGAACCATTCCATTTGCGAAGCACGAGGTAGCTGCGATGCACAGGTATGCGGCAAGAATCAAATTTGCAAAATGGTAGACGCTAAAGCAACATGCACATGTGCAGATAAATACCAAAATGTGAATGGGGTGTGTCTACCGGAAGATAAGTGCGACCTTCTGTGCCCCTCAAACAAATCGTGCCTGCTGGAAAATGGGAAAAAAATATGCAAGTGCATTAATGGGTTGACTCTACAGAACGGCGAGTGCGTCTGCTCGGATAGCAGCCAAATTGAAGAAGGACACCTCTGTGTCGCCCAAGAATAAATGTAAACGGAAGGAGTACCAACAGCTCTGCACCAATGAGAAGGAACACTGTGTGTATGATGAGCAGACGGACATTGTGCGGTGCGACTGCGTGGACCACTTCAAGCGGAACGAACGGGGAATTTGCATCCCAGTCGACTACTGCAAAAATGTCACCTGCAAGGAAAATGAGATTTGCAAAGTTGTTAATAATACACCCACATGTGAGTGTAAAGAAAATTTAAAAAGAAATACTTAACAATGAATGTGTATTCAATAACATCTGTGTCTTGTTAATAAAGGGAACTGCCCCATTGATTCGGAGTGCATTTATCACGAGAAAAAAAGGCATCAGTGTTTGTGCCATAAGAAGGGCCTCGTCGCCATTAATGGCAAGTGCGTCATGCAGGACATGTGCAGGAGCGATCAGAACAAATGCTCCGAAAATTCCATTTGTGTAAATCAAGTGAATAAAGAACCGCTGTGCATATGTTTGTTTAATTATGTGAAGAGTCGGTCGGGCGACTCGCCCGAGGGTGGACAGACGTGCGTGGTGGACAATCCCTGCCTCGCGCACAACGGGGGCTGCTCGCCAAACGAGGTTTGCACGTTCAAAAATGGAAAGGTAAGTTGCGCCTGCGGGGAGAACTACCGCCCCAGGGGGAAGGACAGCCCAACGGGACAAGCGGTCAAACGGGGGGAAGCGACCAAACGGGGTGACGCGGGTCAGCCCGGGCAGGCGCACTCAGCAAATGAGAACGCGTGCCTGCCCAAGACGTCCGAGGCGGACCAAACCTTCACCTTCCAGTACAACGACGACGCGGCCATCATTCTCGGGTCCTGCGGAATTATACAGTTTGTGCAAAAGAGCGATCAGGTCATTTGGAAAATTAACAGCAACAATCACTTTTACATTTTTAATTATGACTATCCATCTGAGGGTCAGCTGTCGGCACAAGTCGTGAACAAGCAGGAGAGCAGCATTTTGTACTTAAAGAAAACCCACGCGGGGAAAGTCTTTTACGCCGACTTTGAGTTGGGTCATCAGGGATGCTCCTACGGAAACATGTTTCTCTACGCCCACCGGGAGGAGGCT 
V9AGCAAAAACATTATTATTCTGAACGATGAAATTACCACCATTAAAAGCCCGATTCATTGCATTACCGATATTTATTTTCTGTTTCGCAACGAACTGTATAAAACCTGCATTCAGCATGTGATTAAAGGCCGCACCGAAATTCATGTGCTGGTGCAGAAAAAAATTAACAGCGCGTGGGAAACCCAGACCACCCTGTTTAAAGATCATATGTGGTTTGAACTGCCGAGCGTGTTTAACTTTATTCATAACGATGAAATTATTATTGTGATTTGCCGCTATAAACAGCGCAGCAAACGCGAAGGCACCATTTGCAAACGCTGGAACAGCGTGACCGGCACCATTTATCAGAAAGAAGATGTGCAGATTGATAAAGAAGCCTTTTGCGAACAAAAACCTGGAAAGCTATCAGAGCGTGCCGCTGACCGTGAAAAACAAAAAATTTCTGCTGATTTGCGGCATTCTGAGCTATGAATATAAAACCGCGAACAAAGATAACTTTATTAGCTGCGTGGCGAGCGAAGATAAAGGCCGCACCTGGGGCACCAAAATTCTGATTAACTATGAAGAACTGCAGAAAGGCGTGCCGTATTTTTATCTGCGCCCGATTATTTTTGGCGATGAATTTGGCTTTTATTTTTATAGCCGCATTAGCACCAACAACACCGCGCGCGGCGGCAACTATATGACCTGCACCCTGGATGTGACCAACGAAGGCAAAAAAGAATATAAATTTAAATGCAAACATTTGAGCCTGATTAAACCGGATAAAAGCCTGCAGAACGTGGCGAAACTGAACGGCTATTATATTACCAGCTATGTGAAAAAAGATAACTTTAACGAATGCTATCTGTATTATACCGAACAGAACGCGArrGTGGTGAAACCGAAAGTGCAGAACGATGATCTGAACGGCTGCTATGGCGGCAGCTTTGTGAAACTGGATGAAAGCAAAGCGCTGTTTATTTATAGCACCGGCTATGGCGTGCAGAACATTCATACC CTGTATTATACCCGCTATGAT

TABLE 6 references associated with proteins

Protein Code   5′ position to 3′ (bp) / amino acid position   Reference
X1             (4-1845)                                       Lu J Proteomics 2014
X2             (67-1161)                                      Lu J Proteomics 2014
X3             (70-555)                                       Lu J Proteomics 2014
X4             (4-948)                                        Lu J Proteomics 2014
X5             (73-1659)                                      Lu J Proteomics 2014
X6             (73-1074)                                      Lu J Proteomics 2014
X7             (1384-2190)                                    Lu J Proteomics 2014
X8             (559-2871)                                     Lu J Proteomics 2014
X9             (4-660)                                        Lu J Proteomics 2014
X10            (4-342)                                        Lu J Proteomics 2014
X11            (1264-3261)                                    Lu J Proteomics 2014
X12            (1957-3702)                                    Lu J Proteomics 2014
V1             140 to 1275                                    Hietanen 2015 Infection and Immunity PMID: 26712206
V2             160 to 1135                                    Hietanen 2015 Infection and Immunity PMID: 26712206
V3             161 to 1454                                    Hietanen 2015 Infection and Immunity PMID: 26712206
V4             501 to 1300                                    Hietanen 2015 Infection and Immunity PMID: 26712206
V12            160 to 1170                                    Hietanen 2015 Infection and Immunity PMID: 26712206
V5             161-641                                        Franca 2017 Elife PMID: 28949293
V11            Region II                                      Franca 2017 Elife PMID: 28949293
V10            Region II
V13            Region II
V6             (1522-2697)
V7             (29-551)
V8             (552-1075)
V9             (30-366)

Appendix IIIA

Area Under Curve (1 antigen)  Top 1% of 2 antigen combis  Top 1% of 3 antigen combis  Top 1% of 4 antigen combis  (<9m GMT)/(12m GMT)  (<9m GMT)/(-ve control GMT)
Thailand Brazil Solomons  Thailand Brazil Solomons  Thailand Brazil Solomons  Thailand Brazil Solomons  Thailand Brazil Solomons  Thailand Brazil Solomons
RBP2a

L01

L31 0.805 0.762 0.766 0 0 0 2.6 2.6 2.3 5 2.7 3.7 3.9 3.05 2.56 8.6212.32 5.1 X087885 0.807 0.748 0.697 5.9 0 0 16.7 4.7 7 20.3 9.2 14.64.28 1.79 1.2 9.82 34.44 15.93 PvEBP 0.747 0.739 0.707 0 0 0 1.8 1.8 1.85 2.4 3.1 6.53 5.18 2.01 21.12 8.91 2.61 L55 0.79 0.781 0.643 5.9 5.9 014.6 12.3 1.5 17.2 20.9 2.6 4.94 4.42 1.95 7.9 7.91 1.19 PvRipr 0.7540.772 0.646 0 0 5.9 1.8 5.6 2 3 9.1 3.1 5.01 4.32 2.57 7.02 7.89 1.07L54 0.79 0.727 0.654 5.9 0 0 3.5 2.6 1.8 5.6 4.4 3.1 4.4 2.98 1.88 5.393.82 1.3 L07 0.747 0.765 0.599 0 0 0 2.3 4.7 1.8 3.1 5.3 2.5 2.56 3.111.45 4.3 6.29 1.35 L30 0.732 0.61 0.609 0 0 0 1.2 2.3 2.9 2.3 3.8 5.44.14 1.53 1.55 13.36 2.24 1.79 PVDBPII 0.74 0.773 0.639 0 0 5.9 0.6 3.23.2 1.7 2.6 4 2.76 4.89 1.79 5.14 15.42 1.34 L34 0.767 0.746 0.67 0 0 03.8 7.3 0.6 4.5 16.6 2.2 3.22 2.99 1.84 3.87 4.78 1.46 X092995 0.7920.703 0.642 5.9 0 0 13.7 1.5 2 11.5 1.9 5.6 2.88 1.41 1.03 4.64 8.554.19 L12 0.755 0.731 0.637 5.9 0 0 3.2 3.8 1.8 3.5 6.1 2.9 3.19 2.731.46 3.81 3.47 1.8 rBP1b 0.533 0.578 0.525 5.9 5.9 0 17.5 4.1 1.2 24.14.7 2.5 1.23 1.44 1.11 0.67 0.79 0.84 L23 0.759 0.753 0.

0 0 0 1.5 7 1.2 4 14.8 2.9 2.95 2.67 1.86 4.3 5.09 1.59 L02 0.746 0.7240.677 0 0 0 1.5 2.3 2.3 2.7 3.7 3.9 3.7 3 1.76 3.89 4.07 1.82 L32 0.7050.651 0.

0 0 5.9 1.8 1.2 17 3.7 1.9 30.2 2.79 3.17 1.61 2.24 0.81 0.31 L28 0.7590.755 0.667 5.9 0 0 2.6 1.2 1.2 3.8 2.5 2.6 2.92 2.44 1.43 5.74 5.242.14 L19 0.758 0.67 0.654 0 0 0 1.5 0.9 3.2 2.6 2.3 6.5 3.66 2.18 1.096.58 3.11 4.89 L36 0.727 0.698 0.682 0 0 0 1.5 0.9 2 3.2 1.8 2.8 2.952.44 1.99 3.28 3.2 1.8 L41 0.702 0.66 0.686 0 0 0 1.5 0.6 2 2.3 1.7 3.82.12 1.91 1.72 4.99 3.03 1.9 X088820 0.723 0.666 0.633 5.9 0 0 4.4 0.63.8 4 1.8 6.7 1.9 1.28 0.99 4.04 8.58 5.87 PvDBP.Sa 0.716 0.751 0.616 00 5.9 0.3 2.6 8.8 1.7 2.6 7.2 3.01 4.78 1.85 3.96 12.35 0.83 RBP2a 0.6920.731 0.662 0 0 0 3.5 1.2 0.9 5.4 1.8 1.6 2.42 2.49 1.47 2.46 4.6 1.5L18 0.736 0.663 0.622 0 0 0 2.3 2 2.3 3.1 4.5 3.8 2.22 1.41 0.93 2.532.33 4.31 RBP2cNB 0.744 0.7 0.551 0 0 5.9 1.5 1.2 11.1 3.6 1.9 6.6 3.022.3 1.57 3.87 3.23 0.64 L27 0.735 0.663 0.585 0 0 5.9 2.9 1.5 2 4.5 2.42.7 2.34 2.24 1.66 1.67 1.2 0.63 L42 0.697 0.632 0.593 0 0 0 1.5 0.9 22.9 1.8 3 2.81 1.91 1.85 4.44 2.89 1.19 L14 0.701 0.637 0.581 0 0 0 3.51.2 1.5 4.1 2 3.1 1.94 1.51 1.33 2.85 2.23 1.07 X099930 0.71 0.63 0.5735.9 0 0 3.8 0.9 1.5 4.1 1.7 2.5 1.75 1.27 0.94 2.85 3.15 2.07 PvDBP.R30.685 0.67 0.554 0 0 5.9 2 1.2 2.6 4.1 3 2.7 2.51 2.19 1.73 2.57 3.110.51 L22 0.725 0.622 0.562 0 5.9 0 2.3 4.1 1.5 3 5.6 2.4 1.98 1.25 0.992.28 2.13 1.3 RBP1a 0.668 0.669 0.565 5.9 0 0 0 1.5 0.9 1.2 2.7 1.9 2.42.32 2.49 1.45 2.06 2.59 PvCYRPA 0.779 0.563 0.532 0 0 5.9 0.6 0.9 14 21.9 10.3 2.37 1.25 1.46 4.55 1.59 0.31 L10 0.719 0.588 0.553 0 5.9 0 1.26.1 1.2 2.4 9.3 2.3 2.14 1.31 1.04 3.61 1.39 1.43 L24 0.656 0.595 0.6050 5.9 0 5.3 2.9 1.2 5.5 5.6 2.8 2.01 1.33 0.88 1.75 1.71 5.03 L21 0.6530.597 0.602 0 0 0 1.5 1.8 1.8 3 2.6 4.1 2 1.55 0.93 1.47 1.35 3.08 L510.679 0.625 0.547 5.9 0 5.9 4.1 1.8 3.5 6.2 3.7 5.4 1.85 1.48 1.31 2.041.74 0.89 L25 0.67 0.593 0.58 0 5.9 0 0.9 2.5 0.9 2.1 6 2.8 1.61 1.140.96 2.04 1.76 2.05 L33 0.65 0.608 0.584 0 0 0 1.8 1.2 0.9 3.7 3.1 1.61.83 1.43 1.37 1.63 1.82 1.05 L20 0.674 0.619 0.544 0 0 0 1.5 1.2 1.52.7 2.1 2.9 1.71 1.31 1.23 2.2 
2.08 0.82 X114330 0.666 0.594 0.577 0 0 01.5 1.2 1.5 2.2 2.6 3 1.44 1.15 1.03 2.35 2.2 1.78 L50 0.713 0.604 0.4940 5.9 5.9 1.2 6.4 11.1 2.9 8.6 7.3 2.15 1.55 1.4 2.53 1.34 0.45 L060.686 0.583 0.54 0 0 0 1.5 1.8 1.2 2.5 3.1 2.3 1.91 1.33 0.92 2.23 1.411.57 L05 0.686 0.607 0.499 0 0 0 2 2.3 2 3.9 4.7 3.4 2.23 1.44 1.03 2.11.9 0.72 X080665 0.678 0.595 0.522 0 5.9 0 1.5 3.8 1.2 2.1 6.2 3.6 1.81.25 0.9 2.64 1.8 1.21 L39 0.673 0.56 0.537 5.9 0 0 4.1 1.2 1.5 4 2.42.8 1.64 1.12 0.96 2.96 1.57 1.5 X094350 0.641 0.602 0.516 0 0 0 1.5 21.8 2.7 3.2 4.2 1.47 1.3 0.96 1.79 1.7 1.15 L11 0.652 0.594 0.49 0 5.95.9 3.8 4.4 5 5.3 7.7 10.7 1.58 1.29 0.96 1.67 1.29 0.92 L38 0.64 0.5430.552 0 5.9 0 1.2 5.3 1.5 3 6.3 2.6 1.59 1.2 1.19 1.18 1 0.89 L37 0.6280.608 0.487 0 5.9 5.9 2.6 2 3.2 5.1 3.7 4.9 1.54 1.6 1.15 1.17 0.92 0.73PvGAMA 0.646 0.57 0.495 0 0 5.9 2.3 1.2 6.7 5.3 2.5 6.5 1.64 1.49 1.321.45 0.74 0.53 L49 0.577 0.532 0.6 0 5.9 5.9 1.8 19.6 8.2 2.5 11.9 13.61.26 1.08 0.89 1.24 0.4 0.34 L47 0.641 0.513 0.539 0 5.9 5.9 0.9 5.8 4.71.9 6.8 4.8 1.52 1.29 1.21 1.73 0.51 0.38 L48 0.552 0.586 0.523 5.9 0 02.9 1.2 1.2 4.8 2.4 2.7 1.16 1.23 0.98 1.3 1.56 1.23 RBP2.P2 0.596 0.5440.515 5.9 5.9 5.9 5 14.6 17 6.5 8.9 24.9 1.48 1.34 1.16 0.94 0.66 0.46L03 0.579 0.503 0.566 5.9 5.9 0 2.6 2.3 2 3.8 4.1 4.4 1.59 1.14 0.930.82 0.8 0.51 L52 0.526 0.562 0.524 5.9 5.9 5.9 4.4 4.7 4.1 4.9 4.8 6.31.29 1.4 1.07 0.56 0.6 0.58 L40 0.564 0.55 0.495 0 0 0 1.8 1.5 1.2 3.32.7 3.2 1.23 1.01 0.91 1.08 1.79 1.09

indicates data missing or illegible when filed

Appendix IIIB

(<9m) > (>12m GMT + 2*sd(>12m)  (<9m) > (-ve cont GMT + 2*sd(-ve c

) age trend age trend (P value) Thailand Brazil Solomons Thailand BrazilSolomons Thailand Brazil Solomons Thailand Brazil Solomons RBP2a 34.7 1947 70.8 64.4 45.7 1.02 0.63 1.06 0 0 0 L01 36.1 0 24.3 51.4 56.6 14.30.39 0.52 0.24 0 0 0.0043 L31 22.2 0 7.8 25 38 7.4 0.41 0.34 0.23 0 03.00E−04 X087885 15.3 7.8 5.7 41.7 81 50.9 0.53 0.13 −0.1 0 2.00E−040.0466 PvEBP 26.4 22.9 20 55.3 41 7.8 1.08 0.59 0.21 0 0 0 L55 27.8 17.113.9 38.9 29.8 3.5 0.48 0.46 0.44 0 0 0 PvRipr 25 15.1 23.5 31.9 29.34.8 0.55 0.42 0.2 0 0 0.0013 L54 23.6 16.1 14.3 26.4 19 2.2 0.48 0.330.24 0 0 0 L07 22.2 0 8.3 27.8 41.5 3.9 0.22 0.34 0.19 0 0 4.00E−04 L3023.6 9.8 10.9 47.2 11.7 9.6 0.85 0.16 0.05 0 2.00E−04 0.4217 PVDBPII15.3 19 10.4 20.8 47.3 3.5 0.4 0.63 0.1 0 0 0.076 L34 15.3 12.2 10.912.5 19 3.9 0.35 0.35 0.18 0 0 2.00E−04 X092995 12.5 3.4 1.7 15.3 34.110 0.33 0.09 −0.03 0 0.0034 0.4924 L12 23.6 12.7 5.2 16.7 15.1 3 0.360.22 −0.07 0 0 0.1928 rBP1b 2.8 4.4 4.3 0 0 0 −0.12 0.12 −0.06 0.0011.00E−04 0.1077 L23 9.7 13.7 11.7 12.5 19.5 5.7 0.29 0.22 0.1 0 0 0.0824L02 15.3 10.7 7.4 15.3 13.7 2.6 0.31 0.4 0.02 0 0 0.6554 L32 13.9 20.510 4.2 3.9 0.4 0.15 0.31 0.25 0.0016 0 1.00E−04 L28 18.1 12.7 8.3 45.833.2 9.1 0.46 0.32 0.26 0 0 0 L19 20.8 9.8 3.9 33.3 19.5 10.9 0.62 0.31−0.14 0 0 0.0036 L36 18.1 14.6 11.3 36.1 22 10.4 0.63 0.36 0.3 0 0 0 L419.7 9.3 7.8 29.2 17.6 8.3 0.39 0.41 0.32 0 0 0 X088820 12.5 0 0 15.335.6 14.8 0.17 0.07 −0.02 0 0.0032 0.5905 PvDBP.Sa 18.1 16.6 11.3 16.736.6 1.3 0.39 0.61 0.18 0 0 0.0016 RBP2a 18.1 13.2 9.1 18.1 22.4 3.5 0.30.34 0.1 0 0 0.0144 L18 15.3 3.4 4.3 11.1 6.3 10.4 0.11 0.08 −0.170.0022 0.0106 1.00E−04 RBP2cNB 23.6 16.6 10 18.1 17.6 1.7 0.43 0.35 0.440 0 0 L27 15.3 13.2 10 0 0 0 0.1 0.3 0.15 0.0021 0 3.00E−04 L42 16.712.7 16.1 29.2 20 7 0.5 0.3 0.27 0 0 0 L14 12.5 3.9 5.2 9.7 5.9 1.3 0.050.18 0.02 0.1401 0 0.6094 X099930 5.6 6.8 1.7 8.3 17.6 6.1 0.06 0.02−0.06 0.0734 0.4923 0.1513 PvDBP.R3 13.9 9.8 8.7 13.9 11.2 0.9 0.36 0.330.16 0 0 0.0047 L22 9.7 3.4 3 4.2 5.9 2.6 
0.11 0.16 −0.08 0.0012 00.0611 RBP1a 18.1 16.1 10.4 8.3 18 1.3 0.36 0.44 0.12 0 0 0.0239 PvCYRPA16.7 0 4.8 29.2 11.7 0 0.43 −0.02 0.15 0 0.6208 0.0046 L10 8.3 4.4 312.5 4.4 1.3 0.47 0.16 −0.17 0 0 3.00E−04 L24 9.7 6.8 3.9 4.2 7.3 7 0.120.14 −0.21 0.0069 3.00E−04 0 L21 8.3 6.3 3.5 2.8 6.3 6.1 0.04 0.13 −0.190.3593 4.00E−04 0 L51 4.2 3.9 4.8 2.8 3.9 2.6 0.25 0.22 0.31 0 0 0 L2511.1 2.4 0.9 6.9 4.9 3.9 0.04 0.04 −0.15 0.3008 0.232 0.0025 L33 11.14.9 5.2 6.9 5.9 0.9 0.21 0.22 0.24 0 0 0 L20 9.7 0 4.3 0 0 0 0.01 0.110.02 0.7715 1.00E−04 0.7011 X114330 5.6 5.9 3 8.3 10.7 4.3 0.11 0.05−0.09 4.00E−04 0.103 0.054 L50 11.1 5.4 6.5 5.6 4.4 0.9 0.13 0.27 0.26.00E−04 0 0 L06 6.9 4.4 1.7 2.8 3.4 0.4 −0.03 0.01 −0.35 0.4684 0.69010 L05 12.5 8.8 3.5 5.6 9.8 0.4 0.13 0.15 −0.11 0.0018 1.00E−04 0.0232X080665 4.2 4.4 1.3 2.8 4.4 0.4 0.14 0.08 −0.09 7.00E−04 0.0263 0.0757L39 6.9 3.9 3.5 6.9 4.4 3.5 0.04 0.07 −0.15 0.2562 0.053 0.0064 X0943502.8 0 1.3 0 0 0 0.01 0.12 0.11 0.7336 0 0.0116 L11 6.9 3.4 2.6 1.4 2.4 00.16 0.1 −0.1 0 0.0027 0.0126 L38 6.9 3.4 3.9 0 0 0 −0.03 0.1 0.06 0.4650.0011 0.0898 L37 2.8 4.9 3.9 0 2.4 1.3 −0.03 0.16 0.05 0.3436 0 0.2103PvGAMA 9.7 6.8 9.1 6.9 2.9 0.9 0.19 0.14 0.05 0 0 0.1987 L49 9.7 3.9 3 00 0 −0.09 0 −0.21 0.0088 0.9079 2.00E−04 L47 12.5 4.4 5.2 5.6 1 0 0.020.15 −0.06 0.5816 0 0.3004 L48 0 0 3.5 0 0 0 −0.08 0 −0.14 0.0173 0.99390.0011 RBP2.P2 5.6 4.9 4.3 0 0 0 −0.01 0.13 −0.02 0.7196 0 0.5467 L032.8 0 3 1.4 4.4 0.4 −0.03 0.03 −0.16 0.4053 0.3609 2.00E−04 L52 1.4 5.93 0 0.5 0 −0.15 0.15 0.01 2.00E−04 0 0.8287 L40 9.7 0 0 0 0 0 −0.09 0.04−0.15 0.0058 0.1846 0.0018

indicates data missing or illegible when filed
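
The first two column groups of Appendix IIIB compare antibody levels in recently infected subjects (<9m) against a cut-off of the form "reference GMT + 2*sd". As a hedged illustration only, the sketch below shows one way such a figure could be computed. It assumes that the tabulated values are percentages of subjects above the cut-off, that GMT is the geometric mean titre of the reference group, and that sd is the sample standard deviation of the untransformed reference titres; none of these details is confirmed by the table itself. The function names and the sample numbers are hypothetical.

```python
import math
import statistics


def geometric_mean(values):
    """Geometric mean of strictly positive titres."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def percent_above_cutoff(test_levels, reference_levels):
    """Percentage of test_levels exceeding reference GMT + 2 * sd.

    Assumption: sd is the sample standard deviation of the raw
    (untransformed) reference titres; the source does not specify this.
    """
    cutoff = geometric_mean(reference_levels) + 2 * statistics.stdev(reference_levels)
    above = sum(1 for v in test_levels if v > cutoff)
    return 100 * above / len(test_levels)


# Hypothetical titres: a reference group (for example negative controls)
# and a group of subjects infected within the last 9 months.
reference = [1.0, 2.0, 4.0]
recent = [1.0, 6.0, 10.0, 5.0]
print(percent_above_cutoff(recent, reference))
```

With these made-up numbers the cut-off is roughly 5.06, so two of the four recent titres exceed it.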

Any and all references to publications or other documents, including but not limited to, patents, patent applications, articles, webpages, books, etc., presented in the present application, are herein incorporated by reference in their entirety.

Example embodiments of the devices, systems and methods have been described herein. As noted elsewhere, these embodiments have been described for illustrative purposes only and are not limiting. Other embodiments are possible and are covered by the disclosure, which will be apparent from the teachings contained herein. Thus, the breadth and scope of the disclosure should not be limited by any of the above-described embodiments but should be defined only in accordance with claims supported by the present disclosure and their equivalents. Moreover, embodiments of the subject disclosure may include methods, systems and apparatuses which may further include any and all elements from any other disclosed methods, systems, and apparatuses, including any and all elements corresponding to target particle separation, focusing/concentration. In other words, elements from one or another disclosed embodiments may be interchangeable with elements from other disclosed embodiments. In addition, one or more features/elements of disclosed embodiments may be removed and still result in patentable subject matter (and thus, resulting in yet more embodiments of the subject disclosure). Correspondingly, some embodiments of the present disclosure may be patentably distinct from one and/or another reference by specifically lacking one or more elements/features. In other words, claims to certain embodiments may contain negative limitation to specifically exclude one or more elements/features resulting in embodiments which are patentably distinct from the prior art which include such features/elements.

1. A diagnostic test for Plasmodium vivax or Plasmodium ovale, to determine a likelihood of a specific timing of infection by P. vivax or P. ovale in a subject by determining a level of antibodies to a plurality of antigens in a sample from the subject, wherein the level is measured of at least one antibody to a protein selected from the group consisting of PVX_099980 (L01), PVX_112670, PVX_087885, PVX_082650, PVX_088860, PVX_112680, PVX_112675, PVX_092990, PVX_091710, PVX_117385, PVX_098915, PVX_088820, PVX_117880, PVX_121897, PVX_125728, PVX_001000, PVX_084340, PVX_090330, PVX_125738, PVX_096995, PVX_097715, PVX_094830, PVX_101530, PVX_090970, PVX_084720, PVX_003770, PVX_112690, PVX_003555, PVX_094255, PVX_090265, PVX_099930, PVX_123685, PVX_002550, PVX_082700, PVX_097680, PVX_097625, PVX_082670, PVX_082735, PVX_082645, PVX_097720, PVX_000930, PVX_094350, PVX_000930, PVX_114330, PVX_088820, PVX_080665, PVX_092995, PVX_087885, PVX_003795, PVX_087110, PVX_087670, PVX_081330, PVX_122805, RBP1b (P7), RBP2a (P9), RBP2b (P25) (PVX_094255), RBP2cNB (M5), RBP2-P2 (P55), PvDBP R3-5, PvGAMA, PvRipr, PvCYRPA, Pv DBPII (AH), PvEBP, RBP1a (P5) and Pv DBP (SacI).
 2. (canceled)
 3. The test of claim 1, wherein the protein is selected from the group consisting of: (a) PVX_099980 (L01), PVX_112670, PVX_087885, PVX_082650, PVX_088860, PVX_112680, PVX_112675, PVX_092990, PVX_091710, PVX_117385, PVX_098915, PVX_088820, PVX_117880, PVX_121897, PVX_125728, PVX_001000, PVX_084340, PVX_090330, PVX_125738, PVX_096995, PVX_097715, PVX_094830, PVX_101530, PVX_090970, PVX_084720, PVX_003770, PVX_112690, PVX_003555, PVX_094255, PVX_090265, PVX_099930 and PVX_123685; (b) PVX_099980, PVX_112670, PVX_087885, PVX_082650, PVX_096995, PVX_097715, PVX_094830, PVX_101530, PVX_090970, PVX_084720, PVX_003770, PVX_112690, PVX_003555, PVX_094255, PVX_090265, PVX_099930 and PVX_123685; (c) PVX_099980, PVX_112670, PVX_087885 and PVX_082650; (d) RBP2b (PVX_094255) and PVX_099980 (L01); and (e) PVX_099980 (L01), PVX_112670, PVX_087885, PVX_096995, PVX_097715, PVX_094255, PVX_097625, PVX_097720, PVX_000930, PVX_092995, PvDBP R3-5, PvRipr, and PvEBP.
 4. The test of claim 3, wherein the protein is selected from the group consisting of PVX_099980 (L01), PVX_112670, PVX_087885, PVX_096995, PVX_097715, PVX_094255, PVX_097625, PVX_097720, PVX_000930, PVX_092995, PvDBP R3-5, PvRipr, and PvEBP.
 5. (canceled)
 6. (canceled)
 7. (canceled)
 8. (canceled)
 9. (canceled) 10.The test of claim 1, comprising determining a level of a plurality ofantibodies that bind to a plurality of antigens in a blood sample takenfrom the subject.
 11. The test of claim 10, comprising determining alevel of 2 to 17 antibodies.
 12. (canceled)
 13. The test of claim 1, wherein dynamics of the measured antibodies preferably include a combination of short-lived and long-lived antibodies.
 14. The test of claim 1, wherein the level of antibodies is measured at one time point, or at a plurality of time points.
 15. (canceled)
 16. The test of claim 1, wherein antibody levels are measured in the subject according to a technology providing a continuous measurement of antibody.
 17. The test of claim 16, wherein the technology is selected from the group consisting of bead-based assays (e.g. AlphaScreen® or Luminex® technology), the enzyme linked immunosorbent assay (ELISA), protein microarrays and the luminescence immunoprecipitation system (LIPS).
 18. An apparatus for diagnosis of P. vivax or P. ovale, comprising the diagnostic test of claim 1 and a reader for reading results of the diagnostic test, optionally adapted for portable use.
 19. (canceled)
 20. The apparatus of claim 18, further comprising a transmitter for transmitting said results.
 21. A system for diagnosis of P. vivax or P. ovale, comprising the apparatus of claim 18 and an analyser for analysing the results of the diagnostic test.
 22. A method for diagnosis of P. vivax or P. ovale, comprising performing the diagnostic test of claim 1 to thereby identify individuals with a high probability of being infected with liver-stage hypnozoites.
 23. The test of claim 1, wherein said specific timing relates to an infection occurring within an elapsed time period of 0 to 12 months.
 24. (canceled)
 25. The test of claim 23, wherein said time period is differentiated by month, by week, or by day.
 26. (canceled)
 27. (canceled)
 28. The test of claim 1, wherein a particular time period is determined as a binary decision of a more recent or an older infection, with each time point as a cut-off.
 29. The test of claim 28, wherein said cut off determines whether an infection in a subject was within the past 9 months or later than the past 9 months.
 30. (canceled)
 31. The test of claim 1, comprising further determining an estimate of the time since last P. vivax or P. ovale blood-stage infection according to the time since last PCR-detectable blood-stage parasitemia, or as the time since last infective mosquito bite.
 32. (canceled)
 33. The test of claim 31, comprising determining a frequency of infections during a particular time period and/or time since last infection.
 34. The test of claim 1 for detecting a presymptomatic or asymptomatic infection by P. vivax or P. ovale.
 35. The test of claim 1 for detecting a dormant infection, wherein P. vivax or P. ovale is present in the liver but is not present at significant levels in the blood.
 36. The test of claim 1 for detecting antibodies to malarial proteins that are present in the blood that indicate a high degree of probability of liver-stage infection.
 37. The test of claim 1 for determining progression of infection by P. vivax or P. ovale in a population of a plurality of subjects.
 38. (canceled)
 39. The test of claim 1 for determining whether the infection is starting or whether the infection has reached a peak in terms of exposure of individuals who are naïve to the particular strain of P. vivax or P. ovale causing the infection.
 40. The test of claim 1 for measuring antibodies in the blood of the subject at a plurality of time points to determine decay in the level of each antibody in the blood; and fitting such decay to a suitable model to determine at least one infection parameter.
 41. (canceled)
 42. The test of claim 40, wherein decay in the level of a plurality of different antibodies is determined and the different antibodies are selected to have a range of different half-lives.
 43. The test of claim 40, wherein from two up to twenty different antibodies are measured.
 44. (canceled)
 45. The test of claim 1, wherein a model for determining at least one parameter about the infection in the subject is selected from the group consisting of linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), combined antibody dynamics (CAD), decision trees, random forests, boosted trees and modified decision trees.
 46. (canceled)
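
Claims 40 and 42 refer to measuring antibody levels at several time points, fitting the decay to a suitable model, and comparing antibodies with different half-lives. The following is a minimal illustrative sketch only, not the claimed method: it assumes a single-exponential decay, level(t) = a0 * exp(-rate * t), which the claims do not prescribe, and estimates the decay rate by ordinary least squares on log-transformed levels. The function name fit_exponential_decay and the sample measurements are hypothetical.

```python
import math


def fit_exponential_decay(times, levels):
    """Fit level(t) = a0 * exp(-rate * t) by least squares on log(level).

    Returns (a0, rate, half_life). The single-exponential form is an
    assumption made for illustration, not a model taken from the source.
    """
    if len(times) != len(levels) or len(times) < 2:
        raise ValueError("need at least two paired measurements")
    logs = [math.log(v) for v in levels]  # levels must be positive
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    rate = -slope
    a0 = math.exp(intercept)
    # A non-positive rate means no decay was observed in the data.
    half_life = math.log(2) / rate if rate > 0 else math.inf
    return a0, rate, half_life


# Hypothetical antibody levels sampled at 0, 3, 6 and 9 months,
# constructed so that the level halves every 3 months.
months = [0, 3, 6, 9]
levels = [100 * 0.5 ** (t / 3) for t in months]
a0, rate, half_life = fit_exponential_decay(months, levels)
print(round(a0, 3), round(half_life, 3))
```

Running the same fit separately for each measured antibody would give one half-life per antibody, which is one way to check that a chosen panel spans short-lived and long-lived responses.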